# Wordloop Platform (/docs)


{/* LLM-Context: TL;DR:
  This is the root index of the Wordloop Platform documentation.
  Wordloop is a monorepo consisting of:
  - wordloop-core (Go, Port 4002): REST API, DB (pgvector), domain logic.
  - wordloop-ml (Python/FastAPI, Port 4003): AI/ML tasks, transcription.
  - wordloop-app (Next.js, Port 4001): Web frontend with SSR.
  Core routing philosophy: Trace-First Development.
  Dependencies mapping: Check knowledge-graph.json.
  */}


# Wordloop Platform [#wordloop-platform]

Meeting transcription, speaker identification, and AI-powered conversation intelligence.

## Services [#services]

| Service                                       | Language         | Port | Role                                   |
| --------------------------------------------- | ---------------- | ---- | -------------------------------------- |
| [wordloop-core](learn/services/core/index.md) | Go               | 4002 | REST API, domain logic, database       |
| [wordloop-ml](learn/services/ml/index.md)     | Python / FastAPI | 4003 | Transcription, speaker embeddings, LLM |
| [wordloop-app](learn/services/app/index.md)   | Next.js          | 4001 | Web frontend                           |

## Architecture at a glance [#architecture-at-a-glance]

<Mermaid
  chart="`graph LR
  APP[&#x22;wordloop-app :4001&#x22;] --> CORE[&#x22;wordloop-core :4002&#x22;]
  CORE --> PG[(Postgres)]
  CORE --> PS[&#x22;GCP Pub/Sub&#x22;]
  CORE --> GCS[&#x22;GCP Storage&#x22;]

  ML[&#x22;wordloop-ml :4003&#x22;] --> PS
  ML --> GCS
  ML --> CORE`"
/>

## Navigating the Documentation [#navigating-the-documentation]

If you are new to the platform, we recommend following the sidebar from top to bottom:

1. **[Principles](principles/index.mdx)** — Start by understanding our core philosophy, engineering values, and system constraints.
2. **[Architecture](learn/architecture/overview.mdx)** — See how those principles are applied structurally across the system and infrastructure.
3. **[Development](start/quickstart.md)** — Learn how to spin up the entire platform locally via our custom `./dev` CLI.
4. **Services** — Dive deep into specific implementations for [Core](learn/services/core/index.md), [ML](learn/services/ml/index.md), and [App](learn/services/app/index.md).
5. **API & Schemas** — Reference material for system contracts.


# Postgres with pgvector as the production vector store (/docs/decisions/0001-postgres-for-vector-search)


# 0001 — Postgres with `pgvector` as the production vector store [#0001--postgres-with-pgvector-as-the-production-vector-store]

**Status:** Accepted
**Date:** 2026-04-19
**Deciders:** core platform
**Supersedes:** —
**Superseded by:** —

## Context [#context]

Wordloop generates and stores embeddings for transcript chunks, speaker utterances, and recap summaries. A retrieval-augmented generation (RAG) workflow at read time uses these embeddings to supply context to model calls.

The default instinct when adding a GenAI feature is to reach for a dedicated vector database — Pinecone, Milvus, Weaviate, or similar. These systems offer specialised ANN indexes, horizontal scale, and purpose-built tooling. At our current scale, they also introduce an operational surface we do not need and a split-brain failure mode we actively want to avoid.

Embeddings in Wordloop are not an island. They exist **because** a specific transcript chunk exists. They must appear atomically with the chunk, be removed atomically when the chunk is removed, and obey the same authorisation rules the chunk does. A system where the transcript lives in Postgres and its embedding lives in a separate service that is updated "eventually" is a system where queries will silently return embeddings for deleted content or miss content that was just created — neither of which is acceptable.

## Decision [#decision]

Use PostgreSQL with the `pgvector` extension as the single production vector store. Embeddings live on the row they describe (or in a sibling table joined by primary key), committed in the same transaction as their source data.

## Consequences [#consequences]

**Atomic writes.** Inserting a transcript chunk and its embedding happens in one transaction. If the embedding fails to compute or save, the chunk rolls back. There is no asynchronous reconciliation process and no inconsistency window.
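
A minimal sketch of the atomic write described above. Python's built-in `sqlite3` stands in for Postgres so the example runs anywhere; the table names, columns, and the `embed` callable are hypothetical, and the production path would store the vector in a `pgvector` column rather than a JSON string.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """In-memory stand-in for Postgres; the schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE chunk_embeddings ("
        " chunk_id INTEGER PRIMARY KEY REFERENCES chunks(id),"
        " embedding TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def insert_chunk(conn, text, embed):
    """Insert a chunk and its embedding in one transaction, or neither."""
    with conn:  # commits on success, rolls back if the block raises
        cur = conn.execute("INSERT INTO chunks (text) VALUES (?)", (text,))
        vector = embed(text)  # may raise; the chunk insert is then rolled back
        conn.execute(
            "INSERT INTO chunk_embeddings (chunk_id, embedding) VALUES (?, ?)",
            (cur.lastrowid, json.dumps(vector)),
        )
        return cur.lastrowid
```

If `embed` raises, the `with conn:` block rolls back the chunk insert as well, so no chunk is ever visible without its embedding.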

**One operational surface.** The database we already run, already back up, already monitor, already manage migrations for, is also the vector store. No second system to provision, secure, or teach on-call about.

**One authorisation model.** The row-level security rules that protect transcript data also protect the embeddings. We do not have to re-implement access control in a second system and hope the two models agree.

**Adequate performance at current scale.** `pgvector`'s IVFFlat and HNSW indexes are sufficient for our current and projected vector counts. We benchmark quarterly; we have not approached the scale where a purpose-built vector database would outperform `pgvector` by a margin that justifies the operational cost.

## Alternatives considered [#alternatives-considered]

* **Pinecone, Milvus, Weaviate.** Rejected for the split-brain failure mode and the second operational surface. Revisit if vector count per tenant exceeds \~10M and `pgvector` benchmarks degrade materially.
* **Embeddings in a denormalised column with in-Go cosine comparison.** Rejected for O(n) query cost — acceptable for small datasets in prototypes, unacceptable in production.
* **Embeddings in an object store with a hand-rolled ANN index.** Rejected for the cost of maintaining the index and the absence of transactional guarantees.
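
To make the O(n) cost of the second alternative concrete, here is a hedged sketch of a brute-force cosine scan. It is written in Python rather than Go purely for illustration; the function names and the in-memory row list are assumptions. Every query compares against every stored vector, which is the work an ANN index avoids.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_top_k(query, rows, k=3):
    """rows is a list of (id, vector). Scores every row, so cost is O(n)."""
    scored = [(cosine_similarity(query, vec), row_id) for row_id, vec in rows]
    scored.sort(reverse=True)
    return [row_id for _, row_id in scored[:k]]
```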

## Debt annotation [#debt-annotation]

**Principal:** None beyond the `pgvector` extension install, which is a single SQL statement per environment.

**Interest:** Low. `pgvector` is actively maintained and widely deployed; index tuning (IVFFlat `lists`, HNSW `ef_construction`) is a one-time cost per table.

**Multiplier:** Vector count per tenant. If a single tenant's embedding set grows to the point where a purpose-built vector database outperforms `pgvector`'s ANN indexes by a useful margin (empirically, in the tens of millions of vectors), revisit this decision. The migration path is well understood (dual-write, shadow-read, cut over) but non-trivial.

## Verification [#verification]

* `SELECT extname FROM pg_extension WHERE extname = 'vector';` returns a row on every environment.
* Transcript insertion and embedding insertion appear in the same transaction log entry.
* No application code writes to an external vector service.

## Related [#related]

* [Postgres stack principle](/docs/principles/stack/postgres)
* [AI Engineering principle](/docs/principles/ai-native/ai-engineering)


# Next.js with Server Components for the web app (/docs/decisions/0002-nextjs-ssr-for-app)


# 0002 — Next.js with Server Components for `wordloop-app` [#0002--nextjs-with-server-components-for-wordloop-app]

**Status:** Accepted
**Date:** 2026-04-19
**Deciders:** app platform
**Supersedes:** —
**Superseded by:** —

## Context [#context]

The Wordloop web app renders deeply nested AI-derived context: a Meeting contains TranscriptSegments, each segment has a speaker attribution (Person), the Meeting has a MeetingSynthesis with Topics and TalkingPoints, and a list of Tasks. Opening a Meeting is the single most common view in the product.

A client-side single-page application fetching this context produces a cascading waterfall. The client first fetches the Meeting, waits for the response, fetches the Transcription, waits, fetches Segments, waits, resolves Person records per speaker, waits, fetches the MeetingSynthesis, waits, fetches Tasks. Each hop is a full round trip between the browser and the edge — in practice, five to seven seconds of blank screen on a median connection before any meaningful content appears.

This is not a problem to optimise with skeleton screens or lazy loading. The waterfall is inherent to the data shape and the client-side fetch model.

## Decision [#decision]

Build `wordloop-app` on Next.js with the App Router and React Server Components. Meeting views, synthesis views, and the dashboard fetch their data on the server, close to the database, in a single round trip from the browser. The client receives fully rendered HTML with content already present.

Client components remain where interactivity demands them: the live transcript stream, the editor, the command palette. These are bounded, named islands inside a server-rendered shell.

## Consequences [#consequences]

**Single round trip for the primary view.** Opening a Meeting is one request from the browser; all downstream data fetches happen server-side in parallel, close to the database. Time-to-meaningful-paint drops from seconds to hundreds of milliseconds.
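
The difference between the client-side waterfall and the server-side parallel fetch can be sketched as follows. This is a language-neutral illustration in Python's `asyncio`, not the app's actual code: `fetch` is a stand-in that only sleeps, and `HOP_SECONDS` is a hypothetical latency chosen for the demo.

```python
import asyncio
import time

RESOURCES = ["meeting", "transcription", "segments", "people", "synthesis", "tasks"]
HOP_SECONDS = 0.05  # hypothetical latency per fetch, chosen only for the demo


async def fetch(resource):
    """Stand-in for a data fetch; sleeps to simulate one round trip."""
    await asyncio.sleep(HOP_SECONDS)
    return {"resource": resource}


async def load_waterfall():
    """Client-side shape: each fetch waits for the previous one."""
    return [await fetch(r) for r in RESOURCES]


async def load_parallel():
    """Server-side shape: independent fetches are issued together."""
    return await asyncio.gather(*(fetch(r) for r in RESOURCES))


def timed(coro_fn):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start
```

The waterfall costs roughly the sum of the hops, while the parallel version costs roughly the slowest single hop. Fetches that depend on an earlier result still have to be sequenced, but on the server each of those hops is short.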

**Database queries colocate with the code that needs them.** A Server Component can query Postgres directly (through our Go API in practice, but the programming model is the same: the fetch happens where the latency cost is lowest).

**Client bundles stay small.** Components that never run on the client are never shipped to the client. The JavaScript bundle for the Meeting view is a fraction of what it would be in a pure-SPA architecture.

**A sharper client/server boundary.** Server Components cannot use `useState`, `useEffect`, or browser APIs. The boundary is explicit and enforced by the framework, which catches a common class of hydration bugs at build time.

## Alternatives considered [#alternatives-considered]

* **Pure client-side React + Vite.** Rejected for the waterfall problem described above. Viable only if the data shape were flat, which it is not.
* **Remix / TanStack Start / other RSC-capable frameworks.** Considered equivalent in principle. Next.js chosen for the ecosystem maturity, the production track record of the App Router at our scale, and the team's existing expertise. Revisit if Next.js' direction diverges from our needs.
* **Hybrid: SPA shell + server-rendered HTML snippets.** Rejected for the cognitive overhead of maintaining two rendering models. Server Components give us the same benefit with a single programming model.

## Debt annotation [#debt-annotation]

**Principal:** Moderate. The team has internalised the Server/Client Component boundary; new engineers spend their first week understanding when to use which.

**Interest:** Low to moderate. Next.js ships breaking changes in major versions; we pin and plan upgrades quarterly. The RSC model itself is stable.

**Multiplier:** Framework direction. If Next.js' architectural direction diverges materially from our needs, the cost of migrating is proportional to the size of the app. The Server Components abstraction is portable — Remix and TanStack Start implement the same conceptual model — so the migration risk is bounded.

## Verification [#verification]

* Primary Meeting view renders meaningful content in a single round trip (observed in Core Web Vitals on production).
* `next build` output shows Server Components are not included in client chunks.
* No data-fetch waterfalls in the Network panel for the dashboard or Meeting view.

## Related [#related]

* [Frontend stack principle](/docs/principles/stack/frontend)
* [App Service handbook](/docs/learn/services/app)


# Stateful containers for the ML service (/docs/decisions/0003-stateful-containers-for-ml)


# 0003 — Stateful containers for `wordloop-ml` [#0003--stateful-containers-for-wordloop-ml]

**Status:** Accepted
**Date:** 2026-04-19
**Deciders:** ml platform
**Supersedes:** —
**Superseded by:** —

## Context [#context]

The ML service is responsible for real-time transcription of live Meeting audio, MeetingSynthesis generation from finalised Transcriptions, and embedding generation for retrieval. The transcription path is latency-critical: from the moment a person speaks to the moment the caption renders, the user-perceived budget is under one second.

Serverless function platforms — Lambda, Cloud Run with scale-to-zero, Vercel Edge — are excellent for bursty, stateless workloads with tolerant latency budgets. They are a poor fit for workloads that require:

1. Large model weights loaded into memory (several hundred MB to several GB).
2. Connection-level state for streaming audio frames.
3. Consistently low first-request latency. Serverless cold starts are measured in seconds, which translates directly into user-visible silence during a live meeting.

A cold start of five to ten seconds on the first segment of a Meeting destroys the real-time experience. Warm-up pings mitigate but do not eliminate this, and the cost of keeping a serverless function permanently warm approaches the cost of a dedicated container.

## Decision [#decision]

Run `wordloop-ml` as long-lived FastAPI workers inside orchestrated containers. Models are loaded at container start and remain resident across requests. The container is the unit of scaling — we scale horizontally by adding more containers, not by spinning up more cold functions.
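
A minimal sketch of the "load once, stay resident" pattern, using only the standard library. `ModelRegistry` and `load_fake_model` are hypothetical names; in a FastAPI worker, `startup()` would typically be called from the application's lifespan handler.

```python
class ModelRegistry:
    """Holds model weights for the lifetime of the worker process."""

    def __init__(self, loader):
        self._loader = loader
        self._model = None
        self.load_count = 0

    def startup(self):
        """Call once at container boot, before the worker accepts traffic."""
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1

    def transcribe(self, audio_frame):
        """Serve a request with the resident model; never loads on this path."""
        if self._model is None:
            raise RuntimeError("model not loaded; startup() must run at boot")
        return self._model(audio_frame)


def load_fake_model():
    """Hypothetical loader standing in for reading weights from disk."""
    return lambda frame: f"transcript for {len(frame)} bytes"
```

Because loading happens only in `startup()`, the request path pays no loading cost, and `load_count` gives a cheap way to assert that weights are loaded exactly once per process.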

## Consequences [#consequences]

**Models stay warm.** The first segment of a Meeting transcribes with the same latency as the hundredth. No cold-start penalty on the user-visible path.

**Streaming state is preserved.** An audio stream's position, rolling buffer, and partial transcription state live in the container that handles the stream. No cross-invocation state-reconstruction step.

**Operational posture matches a normal service.** The ML service has rolling deploys, health checks, graceful shutdown, and horizontal scaling — the same operational shape as `wordloop-core`. On-call engineers use the same mental model.

**We pay for idle capacity.** A serverless model would scale to zero at night; our containers do not. At current traffic this is cheaper than the alternative (warm-keeping costs in a serverless model exceed the dedicated container cost), but the crossover point will change with usage patterns.

## Alternatives considered [#alternatives-considered]

* **Lambda / Cloud Functions with scale-to-zero.** Rejected for cold-start latency on the transcription hot path.
* **Cloud Run with always-on minimum instances.** Considered, and a reasonable alternative. We chose explicit container orchestration because it also handles the streaming-state requirement cleanly; Cloud Run's per-request model is awkward for long-lived WebSocket-adjacent connections. Revisit if Cloud Run's streaming support matures.
* **Dedicated GPU nodes.** Not yet required — our current model mix runs adequately on CPU. If we adopt models that demand GPU inference, the decision to run stateful containers still holds; we add GPU node pools.
* **Batch transcription only (no real-time path).** Rejected as a product decision — live transcription is a core Wordloop feature.

## Debt annotation [#debt-annotation]

**Principal:** Moderate. Operating a stateful service means we handle graceful shutdown, connection draining, and rolling-deploy choreography ourselves. This is well-trodden ground and our Go core already does the same.

**Interest:** Steady. Container images must be rebuilt whenever model weights or the Python runtime are updated; that is a normal CI cost.

**Multiplier:** Model size. If model weights grow past what fits comfortably in a container's memory budget (low single-digit GB), we may need to split inference into a dedicated model-serving layer (Triton, Ray Serve) fronted by thin FastAPI workers. The service boundary stays the same; the implementation changes.

## Verification [#verification]

* Time-to-first-caption on a cold Meeting start is under one second at p95 (observed in production latency dashboards).
* No cold-start warm-up hack exists in the deploy pipeline (no scheduled pings, no keep-warm loop).
* Model weights are loaded exactly once per container process, at boot.

## Related [#related]

* [ML Systems stack principle](/docs/principles/stack/ml-systems)
* [Real-Time system-design principle](/docs/principles/system-design/real-time)
* [ML Service handbook](/docs/learn/services/ml)


# Hosting-layer Link header for llms.txt discovery (/docs/decisions/0004-hosting-layer-llms-txt-link-header)


# 0004 — Hosting-layer `Link` header for `llms.txt` discovery [#0004--hosting-layer-link-header-for-llmstxt-discovery]

**Status:** Accepted
**Date:** 2026-04-19
**Deciders:** docs platform
**Supersedes:** —
**Superseded by:** —

## Context [#context]

The `llms.txt` specification recommends that sites advertise their machine-readable index via the HTTP `Link` header with `rel="llms-txt"`, in addition to serving the file at `/llms.txt`. This lets agents discover the index without guessing at conventional paths and without parsing HTML.

The Wordloop documentation site is built with Next.js 15 in static-export mode (`output: 'export'`). Static export does not support runtime middleware, route handlers that mutate response headers, or `next.config.js` `headers()` for the exported bundle — those hooks are only honoured by the Node server, which we are not running in production. Consequently, the `Link` header cannot be set at the framework layer.

The site is served by Firebase Hosting, which supports per-path response headers declaratively in `firebase.json` under `hosting.headers`.

## Decision [#decision]

Set the `Link: </llms.txt>; rel="llms-txt", </llms-full.txt>; rel="llms-full-txt"` header on every response from Firebase Hosting, via the `firebase.json` `hosting.headers` array. Additionally, set `Content-Type: text/markdown; charset=utf-8` on every `**/*.md` path so the per-page markdown exports are served with the correct media type, and `text/plain` on the two `llms*.txt` files.

The header applies to all paths (`source: "/**"`). The `rel` advertisement is cheap and universally safe — every Wordloop documentation page is a valid entry point for an agent that then looks up the index.
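
As an illustration, the `hosting.headers` array could look roughly like the structure this sketch generates. The header values and the `/**` and `**/*.md` sources come from this ADR; the exact `source` patterns for the two `llms*.txt` files and the helper names are assumptions, so check them against the real `firebase.json`.

```python
import json

LINK_VALUE = '</llms.txt>; rel="llms-txt", </llms-full.txt>; rel="llms-full-txt"'


def header_rule(source, key, value):
    """One entry in Firebase Hosting's `headers` array."""
    return {"source": source, "headers": [{"key": key, "value": value}]}


def build_hosting_headers():
    """Return the header rules described in this ADR."""
    rules = [
        header_rule("/**", "Link", LINK_VALUE),
        header_rule("**/*.md", "Content-Type", "text/markdown; charset=utf-8"),
    ]
    # Assumed sources: one explicit rule per index file.
    for path in ("/llms.txt", "/llms-full.txt"):
        rules.append(header_rule(path, "Content-Type", "text/plain; charset=utf-8"))
    return rules


def render_firebase_fragment():
    """Serialise the rules as the `hosting` fragment of firebase.json."""
    return json.dumps({"hosting": {"headers": build_hosting_headers()}}, indent=2)
```

Calling `render_firebase_fragment()` returns a JSON string that can be compared against the committed `firebase.json` in a test.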

## Consequences [#consequences]

* Agents following the `llms.txt` discovery pattern via `curl -I` or a HEAD request find the index without needing to hardcode `/llms.txt`.
* The `.md` exports of each documentation page are served with the correct MIME type; command-line tooling (`curl`, `wget`) treats them as text.
* The configuration lives in `firebase.json` — a hosting-platform-specific file. If we ever migrate hosting providers, this configuration has to be reimplemented in the new provider's equivalent. This is captured in the debt annotation below.

## Alternatives considered [#alternatives-considered]

* **Set the header in a Next.js middleware.** Rejected: middleware is incompatible with static export.
* **Advertise the index via a `<link>` element in `<head>`.** Rejected: HTML equivalents of the `Link` header (`<link rel="llms-txt" href="...">`) are not part of the spec and are not seen by agents doing header-only HEAD requests.
* **Add an Express shim in front of the static export.** Rejected: introducing a server just to set one header sacrifices the operational simplicity that motivated static export in the first place.
* **Rely on convention only (`/llms.txt` at the root).** Rejected: the spec explicitly recommends the header. It is cheap to set and the canonical way for agents to discover the index.

## Debt annotation [#debt-annotation]

**Principal:** \~1 hour. One `firebase.json` edit, one ADR, one test.

**Interest:** Near-zero. The configuration does not drift; the header string is stable.

**Multiplier:** Hosting migration. If we move off Firebase, the `firebase.json` block has to be translated to the new hosting provider's header syntax. The content of the header does not change; only the declaration site does. If we ever move to a self-hosted Next.js runtime, the header moves to middleware and `firebase.json` can be discarded.

## Verification [#verification]

* `curl -I https://docs.wordloop.ai/docs/learn/architecture/overview` shows the `Link` header with both `llms-txt` and `llms-full-txt` targets.
* `curl -I https://docs.wordloop.ai/docs/learn/architecture/overview.md` returns `Content-Type: text/markdown; charset=utf-8`.
* `curl -I https://docs.wordloop.ai/llms.txt` returns `Content-Type: text/plain; charset=utf-8`.
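
The first check above could be automated with a small parser like the one below. It is a sketch under assumptions: the function names are hypothetical, and the regular expression only handles the simple `<target>; rel="name"` form used here, not the full `Link` header grammar.

```python
import re

_LINK_PART = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Map each rel to its target for a simple `Link` header value."""
    return {rel: target for target, rel in _LINK_PART.findall(value)}


def advertises_llms_index(value):
    """True if the header points at both the index and the full export."""
    links = parse_link_header(value)
    return (
        links.get("llms-txt") == "/llms.txt"
        and links.get("llms-full-txt") == "/llms-full.txt"
    )
```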

## Related [#related]

* [Documentation principle](/docs/principles/foundations/documentation) — the dual-audience stance this header operationalises.
* [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems) — the broader principle the discovery mechanism serves.


# Docs are canonical knowledge and skills are the agent execution layer (/docs/decisions/0005-docs-canonical-skills-execution-layer)


# 0005 — Docs are canonical knowledge and skills are the agent execution layer [#0005--docs-are-canonical-knowledge-and-skills-are-the-agent-execution-layer]

**Status:** Accepted
**Date:** 2026-05-01
**Deciders:** docs platform, agent tooling
**Supersedes:** —
**Superseded by:** —

## Context [#context]

Wordloop maintains both a documentation site and a set of agent skills. The docs site is built for humans and agents: it publishes navigable pages, `llms.txt`, `llms-full.txt`, per-page Markdown exports, and MCP resources. The skills are loaded by AI agents to guide task execution.

The previous stance kept docs and skills as fully separate surfaces. That avoided prompt-like content leaking into the docs site, but it also created a drift risk: durable engineering policy could be duplicated in both docs and skill files. We have already seen signs of this class of drift, such as stack-version claims differing between service docs and package metadata.

Modern skill design favours progressive disclosure: concise trigger metadata, a short operating contract, and selective loading of deeper references. This means skill files should not become large documentation mirrors. They should tell the agent what to read, how to act, and how to verify.

## Decision [#decision]

The documentation site is the canonical source for durable engineering knowledge. Agent skills are the execution layer that selects, loads, and applies that knowledge safely.

A docs page owns:

* Principles and architecture guidance.
* Service handbooks and implementation conventions.
* Workflow guides and runbooks.
* ADRs and decision history.
* Generated reference material from specs, schemas, and code.
* Glossary and domain vocabulary.

A skill owns:

* Triggering and task routing.
* Which docs pages to read for each task shape.
* Tool usage, command sequencing, and safety gates.
* Verification steps and eval discipline.
* Agent-specific constraints that do not belong in human-facing docs.

Skills may reference docs pages by slug or MCP resource. Docs pages must not depend on skill internals for their meaning.

## Consequences [#consequences]

* Durable guidance has one canonical maintenance path.
* Human and agent readers consume the same engineering knowledge.
* Skills remain smaller, more triggerable, and easier to evaluate.
* Documentation changes can identify affected skills through a skill-to-doc map.
* Skill changes can identify which canonical docs pages need review.
* The docs site needs stronger freshness, metadata, and health checks because more agent behaviour depends on it.

## Alternatives considered [#alternatives-considered]

* **Keep docs and skills completely separate.** Rejected because it preserves duplicated policy and makes drift a review-discipline problem only.
* **Move most docs into skills.** Rejected because skills are not a good human-reading surface and large skill files weaken progressive disclosure.
* **Have skills fetch arbitrary public documentation at runtime.** Rejected as the default because public retrieval introduces prompt-injection and freshness risks. Trusted local docs, generated Markdown exports, and the Wordloop MCP server are the default context path.
* **Generate skills entirely from docs.** Deferred. It may become useful for simple doc-reference sections, but skill trigger wording and safety gates still need deliberate evaluation.

## Debt annotation [#debt-annotation]

**Principal:** Medium. We need a skill-to-doc map, workflow docs, freshness metadata, and documentation health checks.

**Interest:** Low if automated checks run in CI; high if this remains a manual checklist.

**Multiplier:** Agent autonomy. The more agents rely on docs for task execution, the more expensive stale docs become.

## Verification [#verification]

* Each maintained skill declares its canonical docs dependencies in the skill-to-doc map.
* Documentation health checks validate mapped docs pages exist.
* Stale active docs are flagged by review cadence.
* Skill updates include a docs review step.
* Docs updates include an affected-skills review step.
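
A health check that validates mapped docs pages could be as small as the sketch below. The shape of the skill-to-doc map (skill name to a list of docs slugs) and the function name are assumptions; the real map format may differ.

```python
def find_missing_docs(skill_to_docs, existing_slugs):
    """Return {skill: [missing slugs]} for skills that reference unknown pages."""
    existing = set(existing_slugs)
    missing = {}
    for skill, slugs in skill_to_docs.items():
        gone = sorted(slug for slug in slugs if slug not in existing)
        if gone:
            missing[skill] = gone
    return missing
```

An empty result means every mapped page exists; a CI job could fail the build on anything else.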

## Related [#related]

* [Documentation](/docs/principles/foundations/documentation)
* [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems)
* [Keep Docs and Skills in Sync](/docs/guides/keep-docs-and-skills-in-sync)
* [Correct Documentation Drift](/docs/guides/correct-documentation-drift)


# Architecture Decision Records (/docs/decisions)


# Architecture Decision Records [#architecture-decision-records]

An ADR is how we remember *why*. Code shows what we built; commit history shows when it changed; ADRs show which options we rejected, what tradeoffs we accepted, and what debt we took on. The log is **append-only**: once an ADR is accepted, it is never edited — only superseded.

## Why ADRs matter on this team [#why-adrs-matter-on-this-team]

Two years from now, an engineer — or an agent — will look at a piece of Wordloop and ask "why is this like this?" The answer lives in the ADR. Without it, every design decision regresses to "this is how it was when I got here," and the team loses the ability to challenge decisions on their merits because the merits have been forgotten. We write ADRs for decisions that will be expensive to reverse and decisions that will surprise a reader who does not share our context.

## Statuses [#statuses]

| Status         | Meaning                                           |
| -------------- | ------------------------------------------------- |
| **Proposed**   | Authored but not yet accepted. Under discussion.  |
| **Accepted**   | Current, in force.                                |
| **Rejected**   | Considered and declined, with reasoning.          |
| **Deprecated** | No longer applicable, but historically important. |
| **Superseded** | Replaced by a later ADR (which links back).       |

## Log [#log]

*The catalogue populates as decisions are committed. Each entry includes title, status, author, date, and a Principal / Interest / Multiplier debt annotation — see [Engineering Principles / Documentation](/docs/principles/foundations/documentation) for the model.*

<Callout type="info">
  Authoring a new ADR? Copy the frontmatter and 7-section structure from any existing ADR in this directory. The title is the decision in plain language; the filename is `NNNN-kebab-case-decision.mdx` with the next available number.
</Callout>
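
The naming rule in the callout can be sketched as a helper that picks the next number and slugifies the title. The function names are hypothetical and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption about what "kebab-case" means here.

```python
import re

_ADR_NAME = re.compile(r"^(\d{4})-[a-z0-9-]+\.mdx$")


def slugify(title):
    """Lowercase the title and collapse non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def next_adr_filename(existing_filenames, title):
    """Return `NNNN-kebab-case-decision.mdx` using the next available number."""
    numbers = []
    for name in existing_filenames:
        match = _ADR_NAME.match(name)
        if match:  # skips non-ADR files in the directory
            numbers.append(int(match.group(1)))
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}-{slugify(title)}.mdx"
```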


# Add an API Endpoint (/docs/guides/add-api-endpoint)


# Add an API Endpoint [#add-an-api-endpoint]

## Goal [#goal]

Add a new endpoint to `wordloop-core`, following the spec-first workflow so that the server handler, the TypeScript client, and the reference docs all stay aligned.

## Prerequisites [#prerequisites]

* Local stack running (`./dev start all`) — see [Quickstart](/docs/start/quickstart).
* Familiarity with [API Design](/docs/principles/system-design/api-design) and [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) principles.

## Steps [#steps]

### 1. Update the OpenAPI spec [#1-update-the-openapi-spec]

The spec is the source of truth. Open `specs/core-openapi.json` and add your endpoint:

* Path, method, operationId.
* Request and response schemas with descriptions on every field.
* Example payloads.
* Error responses mapped to our standard error codes ([Reference / Errors](/docs/reference/errors)).
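
The "descriptions on every field" rule is easy to check mechanically. The sketch below walks a schema object and reports properties without a description. It is a hypothetical helper, not part of `./dev`, and it does not resolve `$ref` pointers.

```python
def fields_missing_descriptions(schema, path=""):
    """Return dotted paths of schema properties that lack a description."""
    missing = []
    for name, prop in schema.get("properties", {}).items():
        field_path = f"{path}.{name}" if path else name
        if not prop.get("description"):
            missing.append(field_path)
        # Recurse into nested objects and into object-typed array items.
        if prop.get("type") == "object":
            missing.extend(fields_missing_descriptions(prop, field_path))
        items = prop.get("items", {})
        if items.get("type") == "object":
            missing.extend(fields_missing_descriptions(items, f"{field_path}[]"))
    return missing
```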

### 2. Regenerate handlers and clients [#2-regenerate-handlers-and-clients]

```bash
./dev generate core
```

This produces the server-side handler stub and the TypeScript client surface. See [Code Generation](/docs/guides/code-generation) for details on what runs under the hood.

### 3. Implement the handler [#3-implement-the-handler]

Fill in the generated handler stub. Handlers stay thin — extract inputs, call the application service, shape the response. Business rules belong in the domain; orchestration belongs in the application service.
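
The shape of a thin handler can be sketched as follows. `wordloop-core` is written in Go and the generated stub will look different; this Python version only illustrates the division of labour. All names, the request shape, and the status codes are hypothetical.

```python
from dataclasses import dataclass


class MeetingNotFound(Exception):
    """Error raised by the application service for an unknown meeting."""


@dataclass
class RenameMeetingRequest:
    meeting_id: str
    title: str


class MeetingService:
    """Hypothetical application service; orchestration and rules live here."""

    def __init__(self, meetings):
        self._meetings = meetings  # dict of id -> title, standing in for a repository

    def rename(self, meeting_id, title):
        if meeting_id not in self._meetings:
            raise MeetingNotFound(meeting_id)
        self._meetings[meeting_id] = title.strip()
        return {"id": meeting_id, "title": self._meetings[meeting_id]}


def rename_meeting_handler(service, body):
    """Thin handler: extract inputs, call the service, shape the response."""
    try:
        request = RenameMeetingRequest(meeting_id=body["meeting_id"], title=body["title"])
    except KeyError as exc:
        return 400, {"error": f"missing field: {exc.args[0]}"}
    try:
        meeting = service.rename(request.meeting_id, request.title)
    except MeetingNotFound:
        return 404, {"error": "meeting not found"}
    return 200, meeting
```

The handler contains no business rules: it only translates between the transport and the service, which keeps it easy to regenerate and easy to test.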

### 4. Write a service test [#4-write-a-service-test]

In the handler's test file, spin up the Testcontainers Postgres, make the HTTP call, assert on behaviour and on the OTel trace shape. See [Testing](/docs/principles/foundations/testing) for the discipline.

### 5. Run the relevant checks [#5-run-the-relevant-checks]

```bash
./dev lint core
./dev test core
```

## Verification [#verification]

* `./dev test core` passes.
* The [Core API Reference](/docs/reference/api/core) renders the new endpoint automatically.
* Hitting the endpoint from the local frontend produces the expected response.

## Troubleshooting [#troubleshooting]

* **Generated code is out of date.** Re-run `./dev generate core` and commit the generated files.
* **Testcontainers failing to start.** Run `./dev status` and confirm that Docker is running.
* **Frontend cannot reach the endpoint.** The frontend uses the generated TypeScript client; re-running generation and restarting the Next.js dev server usually fixes it.

See [API Design](/docs/principles/system-design/api-design) for the stance this workflow expresses.


# Add a Service (/docs/guides/add-service)


# Add a Service [#add-a-service]

## Goal [#goal]

Scaffold a new backend service that conforms to our platform conventions — hexagonal structure, OTel instrumentation, standard CI pipeline, `./dev` integration — from day one.

## Prerequisites [#prerequisites]

* An accepted [ADR](/docs/decisions) justifying the new service. "We could just add this to `wordloop-core`" is often the right answer; the ADR documents why it is not.
* Familiarity with [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture), the [Platform](/docs/principles/delivery/platform) stance, and [Go Services](/docs/principles/stack/go-services) or [ML Systems](/docs/principles/stack/ml-systems) depending on the language.

## Steps [#steps]

### 1. Use the scaffolding template [#1-use-the-scaffolding-template]

Our platform ships a bootstrapping template per supported language. It produces:

* The hexagonal directory layout (`domain/`, `ports/`, `adapters/`, `application/`).
* A stub HTTP server with OTel instrumentation configured.
* A standard CI pipeline definition.
* Dockerfile and Cloud Run deployment config.
* `./dev` integration (start, stop, logs, test, lint).
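
To show how the four directories relate, here is a compact sketch with one type per layer. The `Note` example and every name in it are hypothetical; the real template's contents will differ.

```python
from dataclasses import dataclass
from typing import Protocol


# domain/: pure business types and rules, no I/O.
@dataclass(frozen=True)
class Note:
    id: str
    body: str

    def __post_init__(self):
        if not self.body.strip():
            raise ValueError("note body must not be empty")


# ports/: interfaces the application layer depends on.
class NoteRepository(Protocol):
    def save(self, note: Note) -> None: ...

    def get(self, note_id: str) -> Note: ...


# adapters/: concrete implementations of ports (here, in memory).
class InMemoryNoteRepository:
    def __init__(self):
        self._notes = {}

    def save(self, note: Note) -> None:
        self._notes[note.id] = note

    def get(self, note_id: str) -> Note:
        return self._notes[note_id]


# application/: use cases that orchestrate domain objects through ports.
class CreateNote:
    def __init__(self, repo: NoteRepository):
        self._repo = repo

    def __call__(self, note_id: str, body: str) -> Note:
        note = Note(id=note_id, body=body)
        self._repo.save(note)
        return note
```

The application layer sees only the `NoteRepository` port, so swapping the in-memory adapter for a database-backed one does not touch the use case or the domain.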

### 2. Register the service with the platform [#2-register-the-service-with-the-platform]

Add the service to the platform's service registry so that shared tooling — observability, feature flags, secrets — knows it exists. This is the step that makes the service "real" to the rest of the platform.

### 3. Write the first ADR [#3-write-the-first-adr]

A new service is a decision. Capture its purpose, its expected ownership, and the debt it carries (runtime cost, operational surface, coordination overhead) as an ADR.

### 4. Define the service's first SLO [#4-define-the-services-first-slo]

Before the service receives traffic, define the user-facing SLO it will live inside ([Reliability](/docs/principles/quality/reliability)). An SLO-less service is a service that nobody can defend.

### 5. Write the service handbook [#5-write-the-service-handbook]

Create `content/docs/learn/services/<name>/` with `index.mdx`, `architecture.mdx`, and `implementation.mdx`. The handbook explains the "why" that the code cannot.

## Verification [#verification]

* `./dev start <name>` starts the service cleanly.
* `./dev test <name>` passes.
* The service is visible on the platform observability dashboard.
* A fresh engineer can open the service handbook and understand the shape.

## Troubleshooting [#troubleshooting]

* **OTel not exporting.** Check that the service registered its collector endpoint; the template defaults should work, but custom configuration may override them.
* **CI failing on the first push.** The template ships a minimal CI pipeline; extend it with service-specific tests as needed.

See [Platform](/docs/principles/delivery/platform) for the broader stance on service scaffolding.


# Code Generation (/docs/guides/code-generation)


# Code Generation [#code-generation]

The platform uses code generation pipelines to keep API contracts in sync across all services.

## Event types (AsyncAPI) [#event-types-asyncapi]

The AsyncAPI specification in `services/wordloop-core/asyncapi.yaml` is the single source of truth for all event-driven types (WebSocket events and Pub/Sub messages).

```bash
# Compile AsyncAPI spec to typed internal Events for all services
./dev gen events
```

This produces:

| Target         | Tool                       | Output                                                                     |
| -------------- | -------------------------- | -------------------------------------------------------------------------- |
| **Go**         | `asyncapi-codegen`         | `services/wordloop-core/internal/provider/generated/asyncapi.gen.go`       |
| **TypeScript** | `@asyncapi/cli` (Modelina) | `services/wordloop-app/lib/generated/asyncapi.ts`                          |
| **Python**     | `@asyncapi/cli` (Modelina) | `services/wordloop-ml/src/wordloop/providers/generated/asyncapi_models.py` |

Consumer scripts (App, ML) try to fetch the spec from a running Core instance at `http://localhost:4002/asyncapi.yaml` first, and fall back to the local monorepo path for offline generation.

:::info
Core owns the spec and generates its own types locally. App and ML are consumers that pull the spec from Core — following the same pattern as OpenAPI client generation.
:::
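
The live-first, local-fallback behaviour of the consumer scripts can be sketched as follows. This is an illustration under stated assumptions, not the scripts' actual code: the URL and local path come from this guide, while `fetch_live_spec`, `load_spec`, and the two-second timeout are hypothetical.

```python
from pathlib import Path
from typing import Callable, Tuple
from urllib.request import urlopen

CORE_SPEC_URL = "http://localhost:4002/asyncapi.yaml"  # from this guide
LOCAL_SPEC_PATH = Path("services/wordloop-core/asyncapi.yaml")  # from this guide


def fetch_live_spec(url: str, timeout: float = 2.0) -> str:
    """Download the spec from a running Core instance."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def load_spec(
    fetch: Callable[[str], str] = fetch_live_spec,
    url: str = CORE_SPEC_URL,
    local_path: Path = LOCAL_SPEC_PATH,
) -> Tuple[str, str]:
    """Return (spec_text, origin), where origin is "live" or "local"."""
    try:
        return fetch(url), "live"
    except OSError:
        # URLError and connection failures are OSError subclasses.
        # Core is unreachable, so fall back to the checked-in copy.
        return local_path.read_text(encoding="utf-8"), "local"
```

Returning the origin alongside the text lets a generator log whether it used the live or the offline spec, which helps when diagnosing stale generated types.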

## Core → ML client (oapi-codegen) [#core--ml-client-oapi-codegen]

`wordloop-core` generates a Go HTTP client for calling `wordloop-ml`'s API.

```bash
# Core must be running at localhost:4002 and ML at localhost:4003
./dev gen clients
```

Under the hood:

```bash
cd services/wordloop-core
WORDLOOP_ML_BASE_URL=http://127.0.0.1:4003 ./scripts/generate-clients.sh
```

**Adding a new external API client in Core:**

1. Create `internal/provider/<name>/`
2. Add an `oapi-codegen.yaml` config in that directory
3. Set `<NAME>_BASE_URL` when running the script

## ML → Core client (openapi-python-client) [#ml--core-client-openapi-python-client]

`wordloop-ml` generates a Python client for calling `wordloop-core`'s API.

```bash
# Generated in the same run as Core's Go client
./dev gen clients
```

Under the hood:

```bash
cd services/wordloop-ml
./scripts/generate_wordloop_core_client.sh
```

The generated client is written to `src/wordloop/providers/wordloop_core/client/` and **must not be edited manually**.

## App TypeScript client (Orval) [#app-typescript-client-orval]

`wordloop-app` generates TypeScript types, SWR hooks, and API functions from Core's OpenAPI spec.

```bash
# Generated in the same run, via Orval
./dev gen clients
```

Under the hood:

```bash
curl http://localhost:4002/openapi.json -o services/wordloop-app/openapi.json
cd services/wordloop-app && pnpm orval
```

The generated file is `lib/api/generated.ts` — **never edit it manually**. Use the wrapper hooks in `hooks/use-data.ts`.

## Regenerate everything [#regenerate-everything]

```bash
# All services must be running for clients to pull live specs
./dev gen all
```

This runs: `events` → `clients` → `docs`.


# Correct Documentation Drift (/docs/guides/correct-documentation-drift)


# Correct Documentation Drift [#correct-documentation-drift]

## TL;DR [#tldr]

Do not fix drift by editing the first wrong-looking page. First classify the disagreement, identify the source of truth, decide whether the current system or the documented intent is correct, then update every affected surface in one change.

## Drift types [#drift-types]

| Drift type                   | Example                                                 | Default source of truth                         |
| ---------------------------- | ------------------------------------------------------- | ----------------------------------------------- |
| Docs vs code                 | Docs say Next.js 15; package metadata says Next.js 16.  | Code and package metadata                       |
| Docs vs generated contract   | Guide names an endpoint missing from OpenAPI.           | OpenAPI or AsyncAPI source                      |
| Docs vs skill                | Skill duplicates old architecture guidance.             | Docs for knowledge; skill for execution         |
| Data flow vs implementation  | TDD says Core publishes an event that code never emits. | Active delivery decision, then code/tests       |
| Diagram vs topology          | Architecture diagram omits a service boundary.          | Code, deployment config, specs, traces          |
| ADR vs current docs          | Principle page contradicts an accepted ADR.             | ADR until superseded                            |
| Active bet vs delivered code | TDD intent differs from implementation.                 | Product decision: fix code or revise active TDD |
| Runbook vs operations        | Runbook references a retired dashboard.                 | Current operational tooling                     |

## Workflow [#workflow]

### 1. Capture the mismatch [#1-capture-the-mismatch]

Write down the two or more conflicting claims. Be concrete:

* Page or file path.
* Claim text or diagram element.
* Source that contradicts it.
* Date or commit where the contradiction appeared, if known.

Avoid vague reports such as "docs are stale." They are not actionable.

### 2. Classify the surfaces [#2-classify-the-surfaces]

Mark each surface as one of:

* **Generated reference** — contracts, schemas, CLI tables, error catalogues.
* **Runtime source** — code, tests, migrations, deployment config, traces.
* **Active guidance** — principles, service handbooks, runbooks, active TDD docs.
* **Historical record** — accepted ADRs, delivered bets, incident records.
* **Agent execution** — skills and skill evals.

### 3. Identify the source of truth [#3-identify-the-source-of-truth]

Use this order unless the page states a stricter rule:

1. Generated contracts and schemas define public interfaces.
2. Code, migrations, deployment config, and tests define shipped behaviour.
3. Accepted ADRs define historical decisions until superseded.
4. Active bet and TDD docs define current delivery intent before shipping.
5. Principle and service handbook pages define durable guidance.
6. Skills define agent execution behaviour, not durable engineering knowledge.
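
One way to make this ordering mechanical is to encode it as ranked values and pick the highest-precedence surface among those that disagree. The enum and function names below are hypothetical, and the sketch deliberately ignores the "unless the page states a stricter rule" clause.

```python
from enum import IntEnum
from typing import Iterable


class Surface(IntEnum):
    """Lower value means higher precedence, mirroring the numbered list above."""

    GENERATED_CONTRACT = 1
    RUNTIME_SOURCE = 2
    ACCEPTED_ADR = 3
    ACTIVE_TDD = 4
    DURABLE_GUIDANCE = 5
    SKILL = 6


def source_of_truth(conflicting: Iterable[Surface]) -> Surface:
    """Return the surface that wins by default among the conflicting ones."""
    surfaces = list(conflicting)
    if not surfaces:
        raise ValueError("at least one surface is required")
    return min(surfaces)
```

For example, `source_of_truth([Surface.DURABLE_GUIDANCE, Surface.RUNTIME_SOURCE])` returns `Surface.RUNTIME_SOURCE`: when a handbook and the code disagree, the code describes shipped behaviour.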

### 4. Decide whether to fix code or docs [#4-decide-whether-to-fix-code-or-docs]

A mismatch does not always mean the docs are wrong. Ask:

* Did code drift away from an intentional design?
* Did the design change but docs were not updated?
* Did a generated reference fail to regenerate?
* Did a skill preserve old policy after docs changed?
* Did an ADR get superseded without a new ADR?

If the documented design is still correct, fix code or create a delivery task. If shipped behaviour is correct, update active docs and skill references.

### 5. Update all affected surfaces [#5-update-all-affected-surfaces]

A complete drift correction may need changes to:

* Docs page content and `last_reviewed` metadata.
* Diagrams and data-flow descriptions.
* OpenAPI or AsyncAPI specs.
* Code, tests, migrations, or deployment config.
* ADRs when the decision changed.
* Skill context routing and verification steps.
* `llms.txt`, `llms-full.txt`, and Markdown exports.
* Skill-to-doc map entries.

### 6. Add a regression guard [#6-add-a-regression-guard]

Choose the cheapest guard that would have caught the drift:

* Health check for version strings, missing frontmatter, or broken links.
* Contract generation check for API/event reference drift.
* Diagram drift check for service-topology claims.
* Test or trace assertion for runtime flow claims.
* Skill eval for agent behaviour drift.
* Review-cadence change for pages that go stale quickly.
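
As an illustration of the first guard, a version-string check can compare the major versions mentioned in prose against package metadata. This is a minimal sketch, not the actual `docs health` implementation; the function names are hypothetical, and it assumes `package.json` version ranges whose first number is the major version.

```python
import json
import re
from typing import List, Optional, Set


def declared_major(package_json_text: str, package: str) -> Optional[int]:
    """Major version of `package` as declared in package.json, if any."""
    data = json.loads(package_json_text)
    deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})}
    spec = deps.get(package)
    if spec is None:
        return None
    match = re.search(r"\d+", spec)  # first number in "^16.0.1" is the major
    return int(match.group()) if match else None


def documented_majors(doc_text: str, label: str) -> Set[int]:
    """Major versions that follow `label` in prose, e.g. "Next.js 15"."""
    pattern = rf"{re.escape(label)}\s+(\d+)"
    return {int(v) for v in re.findall(pattern, doc_text)}


def version_drift(doc_text: str, label: str, package_json_text: str, package: str) -> List[int]:
    """Documented majors that disagree with package metadata."""
    actual = declared_major(package_json_text, package)
    if actual is None:
        return []
    return sorted(v for v in documented_majors(doc_text, label) if v != actual)
```

With a page that says "Next.js 15" and metadata that declares `"next": "^16.0.1"`, `version_drift` returns `[15]`, which a health check can turn into a failing exit code.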

### 7. Verify [#7-verify]

Run the relevant commands:

```bash
./dev docs health
cd services/wordloop-docs && pnpm run docs:health
```

Run service tests or generation commands when code, contracts, or generated docs changed.

## Data-flow and design-doc drift [#data-flow-and-design-doc-drift]

Data-flow drift is high risk because it misleads implementation and agent planning. Treat these checks as mandatory for active bets and service handbooks:

* Every service boundary in a data-flow diagram has a contract or explicit TODO.
* Every persistent object in a TDD has a schema plan or a reason it is transient.
* Every event shown in a diagram appears in AsyncAPI or is marked proposed.
* Every API operation shown in a guide appears in OpenAPI or is marked proposed.
* Every failure path that crosses a service boundary has an owner and response strategy.
* Every implementation milestone updates active TDD docs when it changes the design.

## Hallucination controls [#hallucination-controls]

Use these controls when correcting drift with AI assistance:

* Ask the agent to cite local source files, specs, or docs slugs for factual claims.
* Prefer generated contracts and package metadata over prose memory.
* Do not accept newly invented standard names, endpoints, event names, or commands without checking the source.
* Require exact paths for changed files and exact commands for verification.
* Search the repository before introducing new terminology.
* Treat external web claims as untrusted until verified against an official source.

## Anti-patterns [#anti-patterns]

* **Patch one page and stop.** Drift is usually cross-surface.
* **Refresh dates without review.** A new date on stale claims is worse than an old date.
* **Rewrite history.** Supersede ADRs and annotate delivered bets instead.
* **Trust AI recall.** Use source files, contracts, and official references.
* **Leave no regression guard.** If the drift was expensive, add a check.

## Related [#related]

* [Documentation Freshness](/docs/operations/documentation-freshness)
* [Keep Docs and Skills in Sync](/docs/guides/keep-docs-and-skills-in-sync)
* [Documentation](/docs/principles/foundations/documentation)


# Deploy (/docs/guides/deploy)


# Deploy [#deploy]

## Goal [#goal]

Take a merged change from `main` and see it running for all users, with a verified canary step in between.

## Prerequisites [#prerequisites]

* Change merged to `main` (we deploy from trunk — see [Progressive Delivery](/docs/principles/delivery/progressive-delivery)).
* Familiarity with the observability dashboard for the service being deployed.

## Steps [#steps]

### 1. CI triggers the deploy [#1-ci-triggers-the-deploy]

Every merge to `main` triggers the CI pipeline: run tests, build the container image, push it to Artifact Registry, and deploy a Cloud Run canary revision.

### 2. Watch the canary [#2-watch-the-canary]

The canary serves a small fraction of traffic. The automated promotion gate compares canary SLO metrics (latency, error rate, user-journey success) against the current production revision.

Monitor the release dashboard; in most cases, automated promotion handles it. Manual override is available when you want to pause or abort.

### 3. Promote or abort [#3-promote-or-abort]

* **Automated promote.** If canary metrics are within tolerance for the watch window, traffic is shifted to 100%.
* **Automated abort.** If canary burn rate exceeds the threshold, traffic is routed back and the team is paged.
* **Manual promote.** For releases with user-facing changes, a human can promote or hold.
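
The three outcomes can be read as one decision function. The sketch below is a simplification under stated assumptions: it compares point-in-time snapshots rather than burn rate, and every name and tolerance value is hypothetical rather than taken from the real promotion gate.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PROMOTE = "promote"
    ABORT = "abort"
    HOLD = "hold"


@dataclass(frozen=True)
class Snapshot:
    p95_latency_ms: float
    error_rate: float            # fraction of failed requests, 0.0 to 1.0
    journey_success_rate: float  # fraction of successful user journeys


@dataclass(frozen=True)
class Tolerance:
    # Hypothetical thresholds; real values belong to the service's SLO.
    max_latency_ratio: float = 1.10
    max_error_rate_delta: float = 0.005
    max_journey_success_drop: float = 0.01


def gate(
    canary: Snapshot,
    production: Snapshot,
    window_complete: bool,
    manual_hold: bool = False,
    tol: Tolerance = Tolerance(),
) -> Decision:
    within_tolerance = (
        canary.p95_latency_ms <= production.p95_latency_ms * tol.max_latency_ratio
        and canary.error_rate - production.error_rate <= tol.max_error_rate_delta
        and production.journey_success_rate - canary.journey_success_rate
        <= tol.max_journey_success_drop
    )
    if not within_tolerance:
        return Decision.ABORT  # route traffic back and page the team
    if manual_hold or not window_complete:
        return Decision.HOLD   # keep watching, or wait for a human
    return Decision.PROMOTE    # shift traffic to 100%
```

Checking for abort before hold means a manual hold never masks a failing canary.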

### 4. Close the release [#4-close-the-release]

Once promotion is complete, close the release ticket, announce in the release channel, and verify the user-facing change behaves as expected.

## Verification [#verification]

* Current traffic is 100% on the new revision.
* SLO dashboards are green.
* Feature flags for the new release (if any) are in the expected state.

## Troubleshooting [#troubleshooting]

* **Canary aborted.** Check the release dashboard for the failing signal. Common causes: a dependency change that increases latency, an environment variable missing in the new revision.
* **Deploy stuck "in progress."** Check Cloud Run logs for the service; a crash-loop will block promotion.
* **SLO burn after promotion.** Roll back via the dashboard; file the incident ticket.

See [Progressive Delivery](/docs/principles/delivery/progressive-delivery) for the broader stance and [Operations / Runbooks](/docs/operations/runbooks) for post-deploy recovery procedures.


# Guides (/docs/guides)


# Guides [#guides]

Guides are **task-oriented**: each one walks you through completing a specific goal, from first command to verification. They assume you already know roughly why you want to do the thing; if you do not, follow the links into [Learn](/docs/learn) or [Engineering Principles](/docs/principles) from inside the guide.

## Developer workflow [#developer-workflow]

<Cards>
  <Card title="Add an API Endpoint" href="/docs/guides/add-api-endpoint" description="From OpenAPI spec to generated handler to passing test." />

  <Card title="Add a Service" href="/docs/guides/add-service" description="Scaffold a new microservice that plugs into gateway, observability, and CI." />

  <Card title="Run Tests" href="/docs/guides/run-tests" description="Unit, integration, and system tests — how to run them locally and read their output." />

  <Card title="Migrate the Schema" href="/docs/guides/migrate-schema" description="Write, review, and apply a Postgres migration with zero downtime." />

  <Card title="Deploy" href="/docs/guides/deploy" description="Trigger a build, watch the canary, promote to production." />

  <Card title="Code Generation" href="/docs/guides/code-generation" description="Regenerate clients, types, and handlers from /specs." />
</Cards>

## How to read a guide [#how-to-read-a-guide]

Every guide is structured the same way: **Goal → Prerequisites → Steps → Verification → Troubleshooting**. If you find a step that fails in a way the guide does not cover, treat that as a bug in the documentation and open a PR against the guide itself — see [Your First Contribution](/docs/start/first-contribution).


# Keep Docs and Skills in Sync (/docs/guides/keep-docs-and-skills-in-sync)


# Keep Docs and Skills in Sync [#keep-docs-and-skills-in-sync]

## TL;DR [#tldr]

Docs hold durable engineering knowledge. Skills control agent execution. When either surface changes, update the skill-to-doc map, review the other surface, run documentation health checks, and evaluate any affected skill behaviour.

## When to use this workflow [#when-to-use-this-workflow]

Use this workflow when you:

* Change a principle, service handbook, workflow guide, runbook, or reference page that an agent skill may load.
* Create, edit, split, rename, or remove an agent skill.
* Move durable guidance from a skill into the docs site.
* Add a docs page that should become canonical context for an existing skill.
* Change skill trigger wording, safety gates, verification commands, or reference-loading instructions.

## Source-of-truth rule [#source-of-truth-rule]

| Content type                                 | Canonical home                         |
| -------------------------------------------- | -------------------------------------- |
| Durable architecture and engineering policy  | Docs site                              |
| Service-specific implementation conventions  | Docs site                              |
| API, event, schema, CLI, and error reference | Generated docs where possible          |
| Historical decisions                         | ADRs                                   |
| Active delivery intent                       | Active bet and TDD docs                |
| Skill triggering and task routing            | Skill frontmatter and SKILL.md         |
| Agent safety gates and verification workflow | Skill SKILL.md                         |
| Skill evaluation prompts and harness         | Skill workspace or skill-factory evals |

## Workflow: changing docs [#workflow-changing-docs]

1. **Identify affected skills.** Check the skill-to-doc map for skills that depend on the page.
2. **Update the docs page.** Keep the page human-readable and agent-readable. Do not write prompt-like instructions into human docs.
3. **Update freshness metadata.** Change `last_reviewed` only after checking the claims against the source of truth.
4. **Review affected skills.** Check whether the skill still points to the right page, loads the right context, and verifies the right behaviour.
5. **Update skill references if needed.** Keep the skill concise; point to docs instead of copying durable guidance.
6. **Run health checks.** Use `./dev docs health` from the platform root.
7. **Run skill evals when behaviour changed.** If trigger wording, routing, or safety gates changed, run representative skill prompts before merging.

## Workflow: changing skills [#workflow-changing-skills]

1. **Decide whether the change is knowledge or execution.** Move durable knowledge to docs. Keep execution behaviour in the skill.
2. **Update the source skill.** Edit `tools/skill-factory/skills/<skill>/` first; sync to `.agents/skills/` after review.
3. **Update the skill-to-doc map.** Add, remove, or rename canonical docs dependencies.
4. **Review mapped docs pages.** Confirm the docs still contain the knowledge the skill is expected to load.
5. **Create or update eval prompts.** Include should-trigger and should-not-trigger cases for trigger changes.
6. **Run health checks.** Confirm mapped docs pages and skill paths exist.
7. **Sync consumed skills.** Run `./dev sync skills` or copy the reviewed skill into `.agents/skills/` using the approved repository workflow.

## Skill-to-doc map rules [#skill-to-doc-map-rules]

Each maintained skill should declare:

* The skill name.
* The source skill path.
* The consumed skill path.
* Canonical docs dependencies by docs slug.
* Optional secondary docs used for specific task variants.
* The review owner.

The map is intentionally lightweight. It does not prove semantic correctness; it makes affected-surface review discoverable.
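
A map entry with these fields might be modelled and checked as below. The on-disk format of the map is not specified in this guide, so the dataclass, its field names, and `entry_problems` are hypothetical; the check only confirms that required fields are filled in and that referenced docs slugs exist.

```python
from dataclasses import dataclass
from typing import Collection, List, Tuple


@dataclass(frozen=True)
class SkillMapEntry:
    name: str
    source_path: str    # e.g. a directory under tools/skill-factory/skills/
    consumed_path: str  # e.g. a directory under .agents/skills/
    canonical_docs: Tuple[str, ...]
    review_owner: str
    secondary_docs: Tuple[str, ...] = ()


def entry_problems(entry: SkillMapEntry, known_slugs: Collection[str]) -> List[str]:
    """Structural problems only; semantic correctness still needs a human."""
    problems = []
    for field_name in ("name", "source_path", "consumed_path", "review_owner"):
        if not getattr(entry, field_name).strip():
            problems.append(f"missing {field_name}")
    if not entry.canonical_docs:
        problems.append("no canonical docs dependencies")
    for slug in (*entry.canonical_docs, *entry.secondary_docs):
        if slug not in known_slugs:
            problems.append(f"unknown docs slug: {slug}")
    return problems
```

An empty list means the entry is discoverable by health checks; it says nothing about whether the skill actually uses the docs well.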

## Review checklist [#review-checklist]

* Does the skill still trigger for the right user prompts?
* Does the skill avoid triggering for adjacent but wrong prompts?
* Does the skill load canonical docs instead of duplicating them?
* Does the docs page avoid agent-only prompt language?
* Do docs, skills, code, generated specs, and ADRs agree on the source-of-truth hierarchy?
* Did `last_reviewed` change only after a real review?
* Did generated `llms-full.txt` and Markdown exports stay current?

## Anti-patterns [#anti-patterns]

* **Shadow policy in skills.** Durable rules copied into SKILL.md instead of linked to docs.
* **Prompt-shaped docs.** Human docs that read like system prompts.
* **Unmapped skills.** A skill that depends on docs but is invisible to health checks.
* **Blind freshness updates.** Changing `last_reviewed` without validating claims.
* **Eval-free trigger edits.** Changing trigger wording without testing realistic prompts.

## Related [#related]

* [Documentation](/docs/principles/foundations/documentation)
* [Documentation Freshness](/docs/operations/documentation-freshness)
* [Correct Documentation Drift](/docs/guides/correct-documentation-drift)
* [Docs are canonical knowledge and skills are the agent execution layer](/docs/decisions/0005-docs-canonical-skills-execution-layer)


# Migrate the Schema (/docs/guides/migrate-schema)


# Migrate the Schema [#migrate-the-schema]

## Goal [#goal]

Change the Postgres schema in a way that is safe for production: additive first, reversible, and non-blocking on hot tables.

## Prerequisites [#prerequisites]

* Familiarity with [Postgres](/docs/principles/stack/postgres) and [Data Engineering](/docs/principles/system-design/data-engineering) principles.
* Local stack running (`./dev start infra`) so you can test the migration against a real database.

## Steps [#steps]

### 1. Draft the migration [#1-draft-the-migration]

Migrations live under `services/wordloop-core/migrations/` (or the equivalent directory for the service that owns the schema). Name them by timestamp and intent: `20260419123000_add_loops_archived_at.up.sql`.

Write the `.up.sql` **additively**:

* Add columns as nullable, or with a default expression that is cheap on a hot table.
* Add new tables as empty.
* Never rename or drop in a single migration — split into "add new", "backfill", "stop reading old", "drop old" across releases.

Write the `.down.sql` as an exact reverse, tested locally.
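
The naming convention and the up/down pairing are easy to check mechanically. The sketch below assumes a 14-digit `YYYYMMDDHHMMSS` timestamp and a lowercase snake_case intent, inferred from the example filename above; the function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Dict, Iterable, List, Optional, Tuple

MIGRATION_RE = re.compile(
    r"^(?P<ts>\d{14})_(?P<intent>[a-z0-9_]+)\.(?P<direction>up|down)\.sql$"
)


def parse_migration(filename: str) -> Optional[Tuple[str, str, str]]:
    """Return (timestamp, intent, direction), or None if the name is malformed."""
    match = MIGRATION_RE.match(filename)
    if match is None:
        return None
    try:
        datetime.strptime(match["ts"], "%Y%m%d%H%M%S")  # reject impossible dates
    except ValueError:
        return None
    return match["ts"], match["intent"], match["direction"]


def missing_directions(filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Map each migration stem to the directions ("up"/"down") it lacks."""
    seen: Dict[str, set] = {}
    for name in filenames:
        parsed = parse_migration(name)
        if parsed is None:
            continue
        ts, intent, direction = parsed
        seen.setdefault(f"{ts}_{intent}", set()).add(direction)
    return {
        stem: sorted({"up", "down"} - directions)
        for stem, directions in seen.items()
        if directions != {"up", "down"}
    }
```

Running `missing_directions` over the migrations directory in CI would flag an `.up.sql` committed without its reverse.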

### 2. Test locally [#2-test-locally]

```bash
./dev migrate up
./dev migrate down
./dev migrate up
```

Round-tripping catches broken `.down.sql` early.

### 3. Backfill in a separate job [#3-backfill-in-a-separate-job]

If the column needs a non-trivial value on historical rows, write a backfill job that chunks through the table and commits in batches. Do **not** backfill inside the migration itself: a long-running migration transaction holds its locks until it commits, blocks replication, and terrifies on-call engineers.
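
A chunked backfill can be sketched with keyset pagination so that each batch is a short transaction. The example uses Python's built-in `sqlite3` purely so it runs anywhere; a real job would target Postgres, and the `loops.title` and `loops.title_normalized` columns are hypothetical.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Fill title_normalized on historical rows; return the number of rows touched."""
    total = 0
    last_id = 0
    while True:
        # Keyset pagination: always move forward by primary key, so the loop
        # terminates even if a row's computed value turns out to be NULL.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM loops "
                "WHERE id > ? AND title_normalized IS NULL "
                "ORDER BY id LIMIT ?",
                (last_id, batch_size),
            )
        ]
        if not ids:
            return total
        conn.executemany(
            "UPDATE loops SET title_normalized = lower(title) WHERE id = ?",
            [(row_id,) for row_id in ids],
        )
        conn.commit()  # one short transaction per batch
        total += len(ids)
        last_id = ids[-1]
```

Committing per batch keeps lock hold times short, and the job can be stopped and restarted safely because it only selects rows that are still unfilled.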

### 4. Coordinate with consumers [#4-coordinate-with-consumers]

If the schema change is part of a renaming or restructuring, the order of deploys matters:

* Deploy the code that reads both old and new columns.
* Run the migration.
* Backfill.
* Deploy the code that reads only the new column.
* In a later release, drop the old column.
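
The first deploy in that sequence, code that reads both columns, usually amounts to "prefer the new column, fall back to the old one". A minimal sketch, assuming a hypothetical rename from `name` to `display_name` and rows represented as dictionaries:

```python
from typing import Any, Mapping, Optional


def read_display_name(row: Mapping[str, Any]) -> Optional[str]:
    """Prefer the new column; fall back to the old one until the backfill is done."""
    new_value = row.get("display_name")
    if new_value is not None:
        return new_value
    return row.get("name")
```

Because the function tolerates a missing `display_name` key, it works both before the migration runs and after the old column is dropped, which is what makes the deploy order safe.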

### 5. Commit the migration and the code change together [#5-commit-the-migration-and-the-code-change-together]

The PR should include the migration and the code that uses it, so reviewers can see the full scope of the change.

## Verification [#verification]

* `./dev migrate status` shows the migration as applied.
* Service tests pass against the migrated schema.
* Rollback tested locally.
* [Database Reference](/docs/reference/database) regenerates cleanly.

## Troubleshooting [#troubleshooting]

* **`ALTER TABLE` is taking forever in staging.** If it is a large table with a `NOT NULL DEFAULT`, the DDL is rewriting every row. Split into "add nullable → backfill → tighten to NOT NULL."
* **`.down.sql` fails.** Down migrations often break when the up migration contains data transformations. Consider whether the down migration is genuinely needed; some migrations are forward-only, and the code must be able to tolerate that.

See [Postgres](/docs/principles/stack/postgres) for the stance that shapes this workflow.


# Run Tests (/docs/guides/run-tests)


# Run Tests [#run-tests]

## Goal [#goal]

Run the right tests for the change you are making — unit, service, or system — and read the output in a way that makes failures actionable.

## Prerequisites [#prerequisites]

* Local stack bootstrapped (`./dev start infra`) so that Testcontainers has a working Docker daemon.
* Familiarity with [Testing](/docs/principles/foundations/testing) — especially the "favour service tests over unit tests" and "emulate, don't mock" disciplines.

## Steps [#steps]

### 1. Run per-service tests [#1-run-per-service-tests]

```bash
./dev test core        # Go service tests for wordloop-core
./dev test ml          # Python tests + evals for wordloop-ml
./dev test app         # Vitest + React Testing Library for wordloop-app
./dev test             # Everything
```

Service tests spin up real Postgres and Pub/Sub containers where needed.

### 2. Run system tests [#2-run-system-tests]

System tests exercise multiple services together through their real APIs and trace assertions.

```bash
./dev test system
```

These take longer; run them before opening a PR that touches multiple services.

### 3. Run with race detection (Go) [#3-run-with-race-detection-go]

```bash
./dev test core -- -race
```

Data races are far easier to catch with the race detector than to debug after the fact; run with `-race` on any change that touches goroutines.

### 4. Run ML evals [#4-run-ml-evals]

```bash
./dev test ml -- --evals
```

Runs the committed eval set. Regressions above the threshold fail the command.

## Verification [#verification]

* Exit code 0 on the targeted suites.
* Trace assertions pass (no missing spans).
* Coverage report (if enabled) shows the change is exercised.

## Troubleshooting [#troubleshooting]

* **"Cannot connect to Docker daemon."** Start Docker Desktop; verify with `./dev status`.
* **Testcontainers starts slowly.** The first run pulls the Postgres image; subsequent runs use the cached image.
* **Flaky test.** Flakiness is a bug. File it; do not retry until green.

See [Testing](/docs/principles/foundations/testing) for the underlying stance.


# Learn the Platform (/docs/learn)


# Learn the Platform [#learn-the-platform]

This section is for understanding — the *why* and *how* behind Wordloop. It is not a tutorial (see [Start Here](/docs/start/quickstart)) and it is not a reference (see [Reference](/docs/reference)). It is the narrative layer that turns a repository of code into a system you can reason about.

## What you will find here [#what-you-will-find-here]

<Cards>
  <Card title="Concepts" href="/docs/learn/concepts" description="The vocabulary of Wordloop: Meetings, People, Transcriptions, Syntheses, Tasks, and the entities that carry them." />

  <Card title="Architecture" href="/docs/learn/architecture/overview" description="System topology, data flow, authentication, workflows, and the infrastructure under them — visualised with C4 diagrams." />

  <Card title="Services" href="/docs/learn/services/core" description="Per-service handbooks: wordloop-core (Go API), wordloop-ml (Python AI runtime), wordloop-app (Next.js frontend)." />
</Cards>

## How to read this section [#how-to-read-this-section]

Start with **Concepts** if the domain is new to you — understanding what a Meeting, Person, and MeetingSynthesis mean in code matters for every change downstream. Move to **Architecture** to see how services compose into a platform, then drop into a **Service** handbook when you need implementation-level depth.

If you want to know *what we believe* about building software at this scale and why, read [Engineering Principles](/docs/principles). If you want to *do something*, see [Guides](/docs/guides). If you want to *look something up*, see [Reference](/docs/reference).


# Documentation Freshness (/docs/operations/documentation-freshness)


# Documentation Freshness [#documentation-freshness]

## TL;DR [#tldr]

Every active documentation page needs an owner, a review cadence, and a visible freshness state. Stale docs are not automatically wrong, but they are lower-trust until reviewed. Historical records such as ADRs and delivered bets are handled differently: they are preserved, corrected with explicit notes when necessary, or superseded.

## Freshness states [#freshness-states]

| State          | Meaning                                                | Reader guidance                                                               |
| -------------- | ------------------------------------------------------ | ----------------------------------------------------------------------------- |
| **Fresh**      | `last_reviewed` is inside the review window.           | Treat as current unless code or contracts prove otherwise.                    |
| **Review due** | The review window has passed.                          | Use with caution; verify against source of truth before making major changes. |
| **Stale**      | The page is more than one review window overdue.       | Do not use as authoritative without checking code, specs, traces, or owners.  |
| **Generated**  | The page is produced from code, contracts, or schemas. | Regenerate from source instead of editing by hand.                            |
| **Historical** | The page records past intent or decisions.             | Preserve history; supersede or add correction notes instead of rewriting.     |

## Review windows [#review-windows]

| Surface             |                Default review window | Status model        | Source of truth                            |
| ------------------- | -----------------------------------: | ------------------- | ------------------------------------------ |
| Principles          |                             6 months | Active              | Docs and accepted ADRs                     |
| Service handbooks   |                             3 months | Active              | Code, package metadata, architecture docs  |
| How-to guides       |                             6 months | Active              | Commands, workflows, and tested paths      |
| Runbooks            |                             3 months | Active              | Operational reality and incident follow-up |
| API reference       |                Every contract change | Generated           | OpenAPI specs                              |
| Event reference     |                Every contract change | Generated           | AsyncAPI specs                             |
| Database reference  |                  Every schema change | Generated or active | Migrations and schema introspection        |
| Glossary            |                             6 months | Active              | Domain vocabulary and product language     |
| Active bet TDD docs | Every material implementation change | Active              | Current delivery intent and code reality   |
| Delivered bet docs  |                            No expiry | Historical          | Archived design record                     |
| ADRs                |                            No expiry | Historical          | Append-only decision record                |
| Agent skills        |    Every skill or mapped docs change | Active              | Skill source plus mapped docs pages        |

## Required frontmatter [#required-frontmatter]

Active authored pages should include:

```yaml
title: Documentation
description: One sentence describing the page.
audience: engineers
owner: docs-platform
last_reviewed: 2026-05-01
review_frequency: P6M
status: active
source_of_truth: docs
```

Generated pages should declare that they are generated where the generator supports it:

```yaml
status: generated
source_of_truth: specs/core-openapi.json
```

Historical pages should not be forced into an active freshness cycle:

```yaml
status: historical
source_of_truth: accepted-adr
```
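
Given this frontmatter, a page's freshness state can be derived from `status`, `last_reviewed`, and `review_frequency`. The sketch below is one possible reading, not the health check's actual logic: it supports only month-based durations such as `P6M`, and it interprets "more than one review window overdue" as more than two windows since the last review.

```python
import calendar
import re
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def parse_months(review_frequency: str) -> int:
    """Parse an ISO 8601 duration of the form P<n>M (months only)."""
    match = re.fullmatch(r"P(\d+)M", review_frequency)
    if match is None:
        raise ValueError(f"unsupported review_frequency: {review_frequency!r}")
    return int(match.group(1))


def freshness_state(status: str, last_reviewed: date, review_frequency: str, today: date) -> str:
    if status == "generated":
        return "Generated"
    if status == "historical":
        return "Historical"
    window = parse_months(review_frequency)
    if today <= add_months(last_reviewed, window):
        return "Fresh"
    if today <= add_months(last_reviewed, 2 * window):
        return "Review due"
    return "Stale"
```

With `last_reviewed: 2026-05-01` and `review_frequency: P6M`, the page is Fresh through 2026-11-01, Review due through 2027-05-01, and Stale after that.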

## Review triggers [#review-triggers]

Review a page before its normal review window when one of these events happens:

* A package, language runtime, framework, or infrastructure version changes.
* A public command, environment variable, port, endpoint, event, or schema changes.
* A service boundary or data-flow diagram changes.
* A skill starts depending on the page for agent execution.
* An incident exposes missing or misleading operational guidance.
* An ADR supersedes a decision that the page explains.
* A user or agent reports confusion caused by the page.

## Stale-page handling [#stale-page-handling]

1. **Classify the page.** Decide whether it is active, generated, or historical.
2. **Find the source of truth.** Use code, specs, migrations, traces, ADRs, or active design docs depending on the claim.
3. **Update the page or mark it historical.** Do not silently keep stale active guidance.
4. **Update `last_reviewed`.** Only update the date after checking the claims, not after touching formatting.
5. **Run documentation health checks.** Confirm metadata, internal links, skill-doc references, and generated corpora are still valid.
6. **Review affected skills.** If a skill depends on the page, check whether the skill's routing or verification steps need to change.
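
To decide whether an active page has passed its review window, a script can add `review_frequency` to `last_reviewed` and compare the result with today's date. The sketch below is illustrative only: the function names are hypothetical, and it supports just the `PnYnMnD` subset of ISO 8601 durations (such as the `P6M` used in the frontmatter example), which may be narrower than what the real health check accepts.

```python
import calendar
import re
from datetime import date, timedelta

# Supports only years, months, and days (e.g. P6M, P1Y, P90D).
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")


def review_due(last_reviewed: date, review_frequency: str) -> date:
    """Return the date on which the next review falls due."""
    match = _DURATION.match(review_frequency)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {review_frequency!r}")
    years, months, days = (int(part or 0) for part in match.groups())
    month_index = last_reviewed.month - 1 + months
    year = last_reviewed.year + years + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day so that e.g. 31 August + 6 months lands on the last day of February.
    day = min(last_reviewed.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(days=days)


def is_stale(last_reviewed: date, review_frequency: str, today: date) -> bool:
    """A page is stale once today is past its review due date."""
    return today > review_due(last_reviewed, review_frequency)
```

A check like this only flags the page. Step 4 still applies: `last_reviewed` changes after a person or agent has verified the claims.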

## What not to do [#what-not-to-do]

* Do not refresh `last_reviewed` without reviewing the claims.
* Do not rewrite accepted ADRs to make them current.
* Do not edit generated reference pages by hand.
* Do not hide stale badges because they are inconvenient.
* Do not rely on humans to notice stale stack versions, command names, or broken links when a script can check them.

## Commands [#commands]

Run the health check from the platform root:

```bash
./dev docs health
```

Run the underlying docs script directly when working inside the docs service:

```bash
cd services/wordloop-docs
pnpm run docs:health
```

## Related [#related]

* [Documentation](/docs/principles/foundations/documentation)
* [Keep Docs and Skills in Sync](/docs/guides/keep-docs-and-skills-in-sync)
* [Correct Documentation Drift](/docs/guides/correct-documentation-drift)


# Operations (/docs/operations)


# Operations [#operations]

The Operations section is written for the person staring at a red graph at 3am — or the one who will, one day. It is different from [Guides](/docs/guides): guides walk you through a happy-path operation you *want* to perform; runbooks walk you through a degraded state you *have to* respond to.

## When to use this section [#when-to-use-this-section]

<Cards>
  <Card title="Troubleshooting" href="/docs/operations/troubleshooting" description="Common failure symptoms and how to localise them — service-by-service diagnostic trees." />

  <Card title="On-Call" href="/docs/operations/on-call" description="Rotation, escalation, incident-response protocol, and the tools an on-call engineer needs on hand." />

  <Card title="Runbooks" href="/docs/operations/runbooks" description="Step-by-step recovery procedures for known failure modes." />
</Cards>

## Writing for 3am [#writing-for-3am]

Operational documentation has a harsh audience: a stressed engineer under time pressure. The bar is high.

* **State the goal at the top.** Every runbook begins with "This runbook restores *X* when *Y*."
* **Number the steps.** Imperative sentences. Exact commands, exact flags, exact expected output.
* **Include rollback.** Every step that changes state must explain how to undo it.
* **Link to observability.** Every step that checks state must link to the dashboard that proves it.
* **Close with escalation.** If the runbook fails, who or what is next?

See [Engineering Principles / Reliability](/docs/principles/quality/reliability) for why we hold this bar.


# On-Call (/docs/operations/on-call)


# On-Call [#on-call]

On-call is the contract we sign with our users: if the platform breaks, someone is responsible for putting it back together, and that someone is paged promptly. This page describes how the rotation is structured, how incidents are handled, and the tools an on-call engineer should have open before their shift starts.

## Rotation [#rotation]

Primary and secondary on-call shifts run in one-week blocks. The calendar is maintained in our paging system; pages route to the current primary with automatic escalation to the secondary if unacknowledged.

## Before your shift [#before-your-shift]

1. **Skim the last two weeks of incidents.** Patterns recur — knowing the last time this alert fired is usually the fastest lead.
2. **Confirm paging works.** Send yourself a test page; verify the escalation chain.
3. **Verify dashboard access.** Observability dashboards, feature-flag console, deploy dashboard, Cloud Run console, database console.
4. **Review recent deploys.** A page five minutes after a deploy is almost certainly about the deploy.

## When you are paged [#when-you-are-paged]

1. **Acknowledge within 5 minutes.** Even if you are not ready to act, acknowledging stops escalation.
2. **Open the incident channel.** The paging system creates one automatically; post your initial assessment there.
3. **Localise, don't rebuild.** Use [Troubleshooting](/docs/operations/troubleshooting) to find the matching diagnostic tree. Do not write new code in an incident unless necessary.
4. **Apply the relevant [runbook](/docs/operations/runbooks).** If none exists, write one during the postmortem.
5. **Escalate when stuck.** 30 minutes without progress is the soft threshold. Call the secondary; call the service owner; call the service leader.

## Communication [#communication]

The incident channel is the record. Post:

* What you saw (the symptom).
* What you checked (the diagnostic path).
* What you did (the mitigation).
* Who else is involved.

One line every few minutes is better than radio silence. Other engineers read the channel to decide whether to jump in; absence of updates reads as "this is handled" when it may not be.

## After the incident [#after-the-incident]

* **Close the page.** Confirm the alert is cleared.
* **Open a postmortem ticket.** Use the blameless postmortem template; name the specific reliability assumption that was invalidated.
* **File action items.** One concrete, closable ticket per action. "Be more careful" is not an action item.
* **Update the runbook.** If the runbook missed a step, fix it while the experience is fresh.

## Tools every on-call engineer should have ready [#tools-every-on-call-engineer-should-have-ready]

* Observability dashboards, pinned per service.
* Deploy dashboard with rollback on hand.
* Feature-flag console with write access.
* Cloud Run console with per-service revision access.
* Database console (read-only by default; write access only on demand, with an audit trail).
* The team's runbook index.

## Related [#related]

* [Reliability](/docs/principles/quality/reliability) — the SLO and error-budget model that shapes what gets paged.
* [Troubleshooting](/docs/operations/troubleshooting) — diagnostic trees for common symptoms.
* [Runbooks](/docs/operations/runbooks) — step-by-step recovery procedures.


# Troubleshooting (/docs/operations/troubleshooting)


# Troubleshooting [#troubleshooting]

This page is for the "something feels off" moment, before you know which runbook to follow. It is a set of diagnostic trees — start from the symptom you can see, follow the branch that narrows the cause, then consult the matching [runbook](/docs/operations/runbooks) or escalate.

## Symptom: the frontend is blank after sign-in [#symptom-the-frontend-is-blank-after-sign-in]

1. **Check the browser console.** Look for 401/403 from `wordloop-core` → Clerk token issue. Look for 5xx → backend issue.
2. **Check the Core service health.** Hit `/healthz` on Core. If it responds, the backend is up; the problem is in auth or in the specific call the app makes first.
3. **Check JWT verification logs** on Core for the incoming request. A mismatch between the Clerk environment and the Core configuration will produce "token signature does not verify" here.
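
The first two branches of this tree can be sketched as a small triage helper. This is an illustration, not platform tooling: the function names and return strings are hypothetical, and the base URL passed to the health probe is whatever Core address applies in your environment (locally, the documented default port is 4002).

```python
import urllib.error
import urllib.request


def localise_blank_frontend(status_code: int) -> str:
    """Map the status code seen in the browser console to the branch to follow."""
    if status_code in (401, 403):
        return "auth: check the Clerk token and Core JWT verification logs"
    if 500 <= status_code <= 599:
        return "backend: check Core health and logs"
    return "other: inspect the first call the app makes after sign-in"


def core_is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True when GET <base_url>/healthz answers with a 2xx status."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False
```

For example, `core_is_healthy("http://localhost:4002")` probes a locally running Core.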

## Symptom: transcription lag is spiking [#symptom-transcription-lag-is-spiking]

1. **Check the ML service trace.** Filter for `transcribe.turn` spans with latency above the SLO. If the model call itself is slow, the model provider or network is the cause.
2. **Check the model-client adapter logs.** Rate-limit responses from the provider surface here.
3. **Check the audio queue depth.** If the queue is deep, consumers are not keeping up: scale the ML workers or investigate a backpressure signal.

## Symptom: WebSocket connections drop repeatedly [#symptom-websocket-connections-drop-repeatedly]

1. **Check the gateway logs** for timeout errors — that usually indicates a platform-layer idle timeout below our expected session length.
2. **Check the client reconnect pattern.** A flood of reconnects from one client suggests a client-side bug; a broader pattern suggests a server-side issue.
3. **Check for `BACKPRESSURE_SHED` error frames.** If clients are being shed, the server is overloaded — check the SLO dashboard.
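
Step 2 asks whether reconnects come from one client or from many. A rough way to answer that from a list of reconnect events is sketched below. The function name and the 80% dominance threshold are assumptions chosen for illustration, and how you extract client identifiers from the gateway logs depends on your log format.

```python
from collections import Counter
from typing import Sequence


def classify_reconnects(client_ids: Sequence[str], dominance: float = 0.8) -> str:
    """Classify reconnect events, given one client identifier per event.

    If a single client accounts for at least `dominance` of all events,
    suspect a client-side bug; otherwise suspect a server-side issue.
    """
    if not client_ids:
        return "none"
    counts = Counter(client_ids)
    top_client, top_count = counts.most_common(1)[0]
    if top_count / len(client_ids) >= dominance:
        return f"client-side: {top_client}"
    return "server-side"
```

The result only says where to look first. It does not replace checking for `BACKPRESSURE_SHED` frames in step 3.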

## Symptom: deploys are failing in CI [#symptom-deploys-are-failing-in-ci]

1. **Check the CI logs for the failing step.** Most failures are one of: tests broke, image build broke, vulnerability scan flagged a dependency.
2. **If tests broke,** run them locally (`./dev test <service>`) — a flaky test should be fixed, not retried.
3. **If the image build broke,** the cause is often a Dockerfile layer change or a base-image update. The CI log shows the failing layer.
4. **If the vulnerability scan flagged a dependency,** the dependency audit is doing its job. Upgrade the dependency or add a justified waiver.

## When to move to a runbook [#when-to-move-to-a-runbook]

If you have localised the symptom to a known failure mode (database slow, cache cold, model provider degraded, Pub/Sub backed up), move to the corresponding [runbook](/docs/operations/runbooks) for the recovery procedure.

## When to escalate [#when-to-escalate]

* Symptom is user-visible and you cannot localise it within 10 minutes.
* Symptom involves suspected security or privacy breach — escalate immediately ([Security](/docs/principles/quality/security), [Privacy](/docs/principles/quality/privacy)).
* Symptom is a novel failure mode not covered by any runbook. Document it in the postmortem for future detection.

See [On-Call](/docs/operations/on-call) for the escalation tree.


# Engineering Manifesto (/docs/principles)


# Engineering Manifesto [#engineering-manifesto]

Software engineering is the discipline of managing complexity and optimising for change. Wordloop is a platform that processes high-volume asynchronous workloads and serves clients in real time at scale — so we lean hard on a solid technical foundation, frictionless developer velocity, and a rigorous engineering culture.

> \[!IMPORTANT]
> These principles are the shared vocabulary we use to decide what to build, how to build it, and what trade-offs we accept. Every page in this hub stands on its own and does not require context from any other document to be useful.

The hub serves three audiences equally: engineers new to Wordloop learning how we think, experienced engineers returning for a stance on a specific domain, and AI agents working on a Wordloop task.

## What we believe [#what-we-believe]

1. **Complexity is the enemy; clarity is the goal.** We choose simple designs, simple tools, and simple processes — and we accept the cost of doing so. Speculative abstraction, premature generalisation, and fear of deletion all compound into the kind of complexity that slows teams down.
2. **Contracts are the single source of truth.** API specifications, event schemas, and database definitions are authoritative. Clients, tests, documentation, and UIs are derived from them. When a spec is wrong, everything downstream is wrong — and that is the correct failure mode, because one visible error beats silent drift across hand-maintained artefacts.
3. **Reliability is designed in, not patched in.** We build for failure from the first commit: idempotency at the API boundary, graceful degradation at the edges, backpressure when downstream systems slow, and observability as a design-time concern rather than an afterthought.
4. **We test the system, not the mock of the system.** Tests that run against real databases, real message brokers, and real HTTP stacks catch the bugs that mocked tests hide. Emulation beats mocking wherever the dependency can run in a container.
5. **Hexagonal architecture is how we structure services.** Ports and adapters, with dependencies flowing inward toward the domain. The predictable file topology is as valuable for the humans reading the code as it is for the agents writing it.
6. **Documentation is a product, not a by-product.** This site is versioned, reviewed, and shipped with the same discipline as code. It serves humans and AI agents, and the structures that help one help the other.
7. **Architectural decisions are append-only.** We record trade-offs as they are made, model them as debt (principal + interest + multiplier), and preserve the history. Re-litigating a past decision without a new decision record is how teams lose their memory.
8. **AI agents are first-class engineers.** They read our docs, write our code, review our diffs, and run our tooling. We design our codebase, our conventions, and this documentation so an agent can operate at the same level of quality as a senior engineer.

## How to read this hub [#how-to-read-this-hub]

Start with the principle closest to your current task. Every page follows the same shape: a short statement of our stance, the industry context that makes it matter, the concrete principles we follow, and the anti-patterns we explicitly reject.

* **[Testing](/docs/principles/foundations/testing)** — How we guarantee reliability with Continuous Risk Assurance: service tests over unit tests, high-fidelity emulation, observability-driven development, and risk-based coverage.

More principle pages are being added as the hub expands to cover foundations, system design, our stack, quality, delivery, and AI-native development. Each new page is self-contained and lands on its own merits.


# CLI Reference (/docs/reference/cli)


# CLI Reference [#cli-reference]

The Wordloop platform has replaced its legacy Makefiles with a bespoke, shell-native `./dev` interface that runs all local workflows safely and predictably.

All targets are run from the monorepo root. Run `./dev help` for a formatted list.

## Lifecycle [#lifecycle]

| Command                              | Description                                                                       |
| ------------------------------------ | --------------------------------------------------------------------------------- |
| `./dev start all`                    | Start infra (Docker) + Core, ML, App, Docs (native)                               |
| `./dev start all --docker`           | Start everything in Docker containers                                             |
| `./dev start infra`                  | Start shared infra only (Postgres, Pub/Sub, Storage, OTel)                        |
| `./dev start [services...]`          | Start specific services natively (e.g. `./dev start core ml`)                     |
| `./dev start [services...] --docker` | Start specific services in Docker containers                                      |
| `./dev stop all`                     | Stop everything safely (Docker + native processes)                                |
| `./dev stop wipe`                    | Destructive: stop everything and destroy all data volumes                         |
| `./dev stop [services...]`           | Stop specific services (auto-detects native vs Docker)                            |
| `./dev logs all`                     | Tail logs for all running services                                                |
| `./dev logs [services...]`           | Tail logs for specific services — supports multi-tail (e.g. `./dev logs core ml`) |
| `./dev attach db`                    | Drop into an interactive psql shell                                               |
| `./dev status`                       | Print local environment ports and endpoints                                       |

<Callout type="info">
  Services run **natively** by default with auto-reload (Air for Go, uvicorn for Python, HMR for Next.js). Use `--docker` to opt into Docker containers when needed.
</Callout>

## Quality [#quality]

| Command             | Description                                                   |
| ------------------- | ------------------------------------------------------------- |
| `./dev test all`    | Execute all testing suites across all packages                |
| `./dev test system` | Run end-to-end tests across service boundaries via Pytest     |
| `./dev test smoke`  | Run infrastructure health smoke tests                         |
| `./dev test core`   | Run Go test suites                                            |
| `./dev test ml`     | Run Python Pytest suites                                      |
| `./dev test app`    | Run TS Vitest suites                                          |
| `./dev lint all`    | Run static analysis across all services                       |
| `./dev lint core`   | Run `go vet` on Core                                          |
| `./dev lint ml`     | Run `ruff check` on ML                                        |
| `./dev lint app`    | Run `eslint` on App                                           |

## Utilities [#utilities]

| Command               | Description                                       |
| --------------------- | ------------------------------------------------- |
| `./dev db migrate`    | Apply all pending Core DB migrations              |
| `./dev db rollback`   | Revert the single most recently applied migration |
| `./dev db drop`       | Destructive: completely drop the schema           |
| `./dev db shell`      | Drop securely into the local PostgreSQL console   |
| `./dev dash obs`      | Open the .NET Aspire Observability UI Dashboard   |
| `./dev dash api`      | Open the ML API Swagger docs                      |
| `./dev dash app`      | Open the Next.js App                              |
| `./dev dash docs`     | Open the Fumadocs Documentation UI                |
| `./dev gcp pubsub`    | Interact with local Pub/Sub emulator via gcloud   |
| `./dev gcp storage`   | Query the local Storage emulator REST API         |
| `./dev gen api`       | Generate OpenAPI schemas                          |
| `./dev gen events`    | Generate AsyncAPI structs across all services     |
| `./dev gen clients`   | Rebuild typed API clients (Orval + Go + Python)   |
| `./dev gen docs`      | Recompile OpenAPI metadata for docs UI            |
| `./dev setup env`     | Copy environment baseline configurations          |
| `./dev setup install` | Install workspace-wide package dependencies       |

## System [#system]

| Command                  | Description                                                                       |
| ------------------------ | --------------------------------------------------------------------------------- |
| `./dev doctor`           | Validate all system dependencies, Docker status, port availability, and env files |
| `./dev completions zsh`  | Output zsh auto-completion script                                                 |
| `./dev completions bash` | Output bash auto-completion script                                                |

<Callout type="tip">
  **First time?** Run `./dev doctor` immediately after cloning to verify your machine has everything needed.
</Callout>

### Enabling auto-completion [#enabling-auto-completion]

```bash
# Zsh — add to ~/.zshrc for permanent access
eval "$(./dev completions zsh)"

# Bash — add to ~/.bashrc
eval "$(./dev completions bash)"
```

After sourcing, typing `./dev ` then pressing Tab will suggest available commands and sub-targets.

## Native vs Docker [#native-vs-docker]

By default, `./dev start core` runs the Go service natively using Air for auto-reload. This means:

* **File changes are detected automatically** — Air watches `.go` files and rebuilds in \~1 second
* **Migrations run on every restart** — database schema is always current
* **Logs go to `.dev/logs/`** — tail them with `./dev logs core`
* **IDE debugging works** — you can also run Core from your IDE's debugger instead

Use `--docker` when you need full containerized behavior (e.g., testing Dockerfiles, CI parity, or running without Go installed locally).

## Debug Environments [#debug-environments]

Starting a subset of services (e.g., `./dev start infra core`) intentionally leaves the others, such as `wordloop-ml`, stopped. You can then run those services from your IDE (for example, a VS Code launch configuration) and get full breakpoint debugging while the rest of the stack runs natively or in containers.

## Resilience Model [#resilience-model]

The CLI is designed for safety and resilience:

* **Graceful shutdown**: `./dev stop` sends `SIGTERM` first, allowing services to flush connections and clean up. It falls back to `SIGKILL` only after a 3-second grace period.
* **Subshell isolation**: All commands run in isolated subshells, preventing `cd` side-effects from corrupting your terminal's working directory.
* **Port conflict detection**: `./dev doctor` and `./dev start` both check for port conflicts before launching services.
* **No external dependencies**: Port checking uses native bash `/dev/tcp` instead of requiring `nc` or `netcat`.
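
The shutdown and port-check behaviour described above can be illustrated in Python. This is an analogue for explanation only: the CLI itself is a shell script that uses bash `/dev/tcp`, and the function names here are hypothetical.

```python
import socket
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 3.0) -> str:
    """Send SIGTERM, wait for the grace period, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()
        return "killed"


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Start a throwaway child process and stop it with the same two-phase strategy.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print(stop_gracefully(child))
    print(port_in_use(4002))
```

Trying a plain TCP connection is the same idea as the `/dev/tcp` check: it needs no extra tools, only the ability to open a socket.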


# Configuration (/docs/reference/configuration)


# Configuration [#configuration]

Every service in the Wordloop platform loads its configuration from environment variables, following the [Twelve-Factor App](https://12factor.net/) config principle. This page is the canonical catalogue of those variables — what they do, what their defaults are, and which service owns them.

<Callout type="info">
  Local defaults are generated by `./dev setup env`. The variables listed here are the full contract; your local `.env` files typically override only the subset you need.
</Callout>

## Common variables [#common-variables]

Variables consumed by multiple services.

| Variable                      | Service(s) | Default (local)         | Purpose                                                                                                      |
| ----------------------------- | ---------- | ----------------------- | ------------------------------------------------------------------------------------------------------------ |
| `APP_ENV`                     | all        | `development`           | `development`, `test`, `staging`, `production`. Controls auth mode, logging verbosity, and feature defaults. |
| `DATABASE_URL`                | core       | derived                 | Postgres connection string.                                                                                  |
| `PUBSUB_EMULATOR_HOST`        | core, ml   | `localhost:8085`        | Local Pub/Sub emulator. Unset in production.                                                                 |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | all        | `http://localhost:4318` | Collector endpoint for traces, metrics, and logs.                                                            |
| `LOG_LEVEL`                   | all        | `info`                  | `debug`, `info`, `warn`, `error`.                                                                            |

## `wordloop-core` [#wordloop-core]

| Variable                | Default                | Purpose                                  |
| ----------------------- | ---------------------- | ---------------------------------------- |
| `CORE_PORT`             | `4002`                 | HTTP + WebSocket port.                   |
| `CLERK_SECRET_KEY`      | —                      | Backend Clerk key for JWT verification.  |
| `CLERK_PUBLISHABLE_KEY` | —                      | Frontend-shared key; surfaced for debug. |
| `STORAGE_BUCKET`        | `wordloop-local-audio` | GCS bucket for audio artefacts.          |

## `wordloop-ml` [#wordloop-ml]

| Variable               | Default     | Purpose                                       |
| ---------------------- | ----------- | --------------------------------------------- |
| `ML_PORT`              | `4003`      | FastAPI port.                                 |
| `MODEL_PROVIDER`       | `anthropic` | Chooses which model adapter to load.          |
| `ANTHROPIC_API_KEY`    | —           | Set when `MODEL_PROVIDER=anthropic`.          |
| `OPENAI_API_KEY`       | —           | Set when `MODEL_PROVIDER=openai`.             |
| `ML_CACHE_TTL_SECONDS` | `3600`      | Cache lifetime for deterministic model calls. |
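
As an illustration of how these variables fit together, the sketch below loads them into a typed settings object. It is not the service's actual configuration code: the `MLSettings` class and `load_ml_settings` function are hypothetical, and failing fast when the provider's API key is missing is an assumption based on the "Set when" wording in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Which API key each provider needs, per the table above.
_PROVIDER_KEYS = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}


@dataclass(frozen=True)
class MLSettings:
    port: int
    model_provider: str
    api_key: str
    cache_ttl_seconds: int


def load_ml_settings(env: Mapping[str, str] = os.environ) -> MLSettings:
    """Read ML settings from the environment, applying the documented defaults."""
    provider = env.get("MODEL_PROVIDER", "anthropic")
    key_name = _PROVIDER_KEYS.get(provider)
    if key_name is None:
        raise ValueError(f"unknown MODEL_PROVIDER: {provider!r}")
    api_key = env.get(key_name, "")
    if not api_key:
        # Assumption: a missing key is treated as a startup error.
        raise ValueError(f"{key_name} must be set when MODEL_PROVIDER={provider}")
    return MLSettings(
        port=int(env.get("ML_PORT", "4003")),
        model_provider=provider,
        api_key=api_key,
        cache_ttl_seconds=int(env.get("ML_CACHE_TTL_SECONDS", "3600")),
    )
```

Passing the environment as a mapping keeps the loader easy to test: a plain dictionary can stand in for `os.environ`.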

## `wordloop-app` [#wordloop-app]

| Variable                            | Default                 | Purpose                             |
| ----------------------------------- | ----------------------- | ----------------------------------- |
| `NEXT_PUBLIC_CORE_URL`              | `http://localhost:4002` | URL the browser uses to reach Core. |
| `NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY` | —                       | Clerk frontend key.                 |
| `APP_PORT`                          | `4001`                  | Next.js port.                       |

## Feature flags [#feature-flags]

Feature flags are served dynamically — they are not environment variables. See the flag dashboard for the current state and owners. Progressive-delivery principles ([Progressive Delivery](/docs/principles/delivery/progressive-delivery)) govern how flags are created, rolled, and retired.

## Further reading [#further-reading]

* [Quickstart](/docs/start/quickstart) — bootstrapping local `.env` files.
* [Security](/docs/principles/quality/security) — the rules around secret handling.
* [Twelve-Factor App](https://12factor.net/) — the philosophy behind environment-based config.


# Database Schema (/docs/reference/database)


# Database Schema [#database-schema]

The Postgres database is owned exclusively by `wordloop-core`.

<Callout type="warn" title="Important">
  Schema changes must be managed through versioned SQL migrations in `services/wordloop-core/scripts/migrations/`. Do not apply manual schema alterations.
</Callout>

## ER diagram [#er-diagram]

<Mermaid
  chart="`erDiagram
  users ||--o{ meetings : owns
  users ||--o{ tasks : owns
  users ||--o{ notes : owns
  users ||--o{ ai_threads : owns
  users ||--o| people : person_id

  meetings ||--o{ meeting_attendees : has
  meetings ||--o{ transcriptions : has
  transcriptions ||--o{ transcript_segments : has
  meetings ||--o{ tasks : source
  meetings ||--o{ meeting_audio_files : has

  people ||--o{ meeting_attendees : attends
  people ||--o{ transcript_segments : speaks
  people ||--o{ tasks : assigned_to

  ai_threads ||--o{ chat_messages : contains
`"
/>

## Tables [#tables]

### `users` [#users]

Primary user account linked to Auth0.

| Column       | Type             | Notes              |
| ------------ | ---------------- | ------------------ |
| `id`         | UUID PK          |                    |
| `auth0_id`   | TEXT UNIQUE      | External identity  |
| `email`      | TEXT             |                    |
| `name`       | TEXT             |                    |
| `person_id`  | UUID FK → people | Optional self-link |
| `created_at` | TIMESTAMPTZ      |                    |

### `people` [#people]

Contacts and meeting participants.

| Column               | Type        | Notes                              |
| -------------------- | ----------- | ---------------------------------- |
| `id`                 | UUID PK     |                                    |
| `name`               | TEXT        |                                    |
| `role`               | TEXT        | Job title / role description       |
| `email`              | TEXT        |                                    |
| `company`            | TEXT        |                                    |
| `tags`               | JSONB       |                                    |
| `voice_model_status` | TEXT        | `untrained` / `training` / `ready` |
| `voice_confidence`   | DECIMAL     |                                    |
| `voice_vector`       | vector(512) | Optional SpeechBrain embedding     |

### `meetings` [#meetings]

Recorded or uploaded conversations.

| Column        | Type            | Notes                                         |
| ------------- | --------------- | --------------------------------------------- |
| `id`          | UUID PK         |                                               |
| `user_id`     | UUID FK → users |                                               |
| `title`       | TEXT            |                                               |
| `start_time`  | TIMESTAMPTZ     |                                               |
| `end_time`    | TIMESTAMPTZ     |                                               |
| `summary`     | TEXT            | AI-generated summary                          |
| `key_points`  | JSONB           |                                               |
| `source_type` | TEXT            | `recording` / `upload` / `text` / `anecdotal` |
| `created_at`  | TIMESTAMPTZ     |                                               |

### `meeting_audio_files` [#meeting_audio_files]

Audio files attached to meetings.

| Column         | Type               | Notes     |
| -------------- | ------------------ | --------- |
| `id`           | UUID PK            |           |
| `meeting_id`   | UUID FK → meetings |           |
| `storage_path` | TEXT               | GCS path  |
| `file_name`    | TEXT               |           |
| `content_type` | TEXT               | MIME type |
| `file_size`    | BIGINT             | Bytes     |
| `created_at`   | TIMESTAMPTZ        |           |

### `transcriptions` [#transcriptions]

Records the transcription job details connected to a meeting.

| Column           | Type               | Notes                                                                                      |
| ---------------- | ------------------ | ------------------------------------------------------------------------------------------ |
| `id`             | UUID PK            |                                                                                            |
| `meeting_id`     | UUID FK → meetings |                                                                                            |
| `status`         | TEXT               | enum: `pending`, `transcribing`, `diarizing`, `extracting_features`, `completed`, `failed` |
| `status_message` | TEXT               | Optional error details                                                                     |
| `created_at`     | TIMESTAMPTZ        |                                                                                            |
| `updated_at`     | TIMESTAMPTZ        |                                                                                            |
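
Because `status` is stored as `TEXT`, application code typically mirrors the allowed values in an enum. The sketch below is one hypothetical way to do that in Python. The enum values come from the table above. Treating `completed` and `failed` as terminal states is an assumption, and the names are not taken from the codebase.

```python
from enum import Enum


class TranscriptionStatus(str, Enum):
    PENDING = "pending"
    TRANSCRIBING = "transcribing"
    DIARIZING = "diarizing"
    EXTRACTING_FEATURES = "extracting_features"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: a job in one of these states will not change status again.
TERMINAL_STATUSES = frozenset({TranscriptionStatus.COMPLETED, TranscriptionStatus.FAILED})


def parse_status(raw: str) -> TranscriptionStatus:
    """Convert a stored status string; raises ValueError for unknown values."""
    return TranscriptionStatus(raw)


def is_terminal(raw: str) -> bool:
    """Return True when the stored status is assumed to be final."""
    return parse_status(raw) in TERMINAL_STATUSES
```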

### `transcription_status_history` [#transcription_status_history]

Audit log of transcription status changes.

| Column             | Type                     | Notes |
| ------------------ | ------------------------ | ----- |
| `id`               | UUID PK                  |       |
| `transcription_id` | UUID FK → transcriptions |       |
| `status`           | TEXT                     |       |
| `status_message`   | TEXT                     |       |
| `created_at`       | TIMESTAMPTZ              |       |

### `transcript_segments` [#transcript_segments]

Timestamped chunks of transcribed speech.

| Column             | Type                     | Notes                                         |
| ------------------ | ------------------------ | --------------------------------------------- |
| `id`               | UUID PK                  |                                               |
| `transcription_id` | UUID FK → transcriptions |                                               |
| `person_id`        | UUID FK → people         | Nullable                                      |
| `speaker_label`    | TEXT                     | Temporary label before identification         |
| `text`             | TEXT                     |                                               |
| `start_time`       | DECIMAL                  | Seconds from start                            |
| `end_time`         | DECIMAL                  | Seconds from start                            |
| `confidence`       | DECIMAL                  | Transcription confidence                      |
| `is_final`         | BOOLEAN                  | Indicates if segment is finalized (streaming) |
| `feature_vector`   | vector(512)              | SpeechBrain embedding                         |
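
To show how a segment's `feature_vector` might be compared with the `voice_vector` stored on `people`, here is a plain-Python cosine-similarity sketch. It is an assumption-laden illustration: the source does not state which distance metric or threshold the platform uses, the function names and the `0.75` threshold are hypothetical, and such a comparison could equally run inside Postgres on the `vector(512)` columns.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_speaker(
    segment_vector: Sequence[float],
    voice_vectors: Dict[str, Sequence[float]],
    threshold: float = 0.75,  # hypothetical cut-off
) -> Optional[str]:
    """Return the person id with the most similar voice vector, or None."""
    best_id, best_score = None, threshold
    for person_id, voice_vector in voice_vectors.items():
        score = cosine_similarity(segment_vector, voice_vector)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

A `None` result corresponds to leaving `person_id` null and keeping the temporary `speaker_label`.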

### `tasks` (formerly `action_items`) [#tasks-formerly-action_items]

Actionable items extracted from meetings.

| Column        | Type               | Notes                   |
| ------------- | ------------------ | ----------------------- |
| `id`          | UUID PK            |                         |
| `user_id`     | UUID FK → users    |                         |
| `content`     | TEXT               |                         |
| `status`      | TEXT               | `pending` / `completed` |
| `due_date`    | DATE               |                         |
| `assigned_to` | UUID FK → people   |                         |
| `meeting_id`  | UUID FK → meetings |                         |
| `sub_tasks`   | JSONB              |                         |
| `created_at`  | TIMESTAMPTZ        |                         |

### `notes` [#notes]

Free-form notes attached to people or meetings.

| Column         | Type            | Notes                |
| -------------- | --------------- | -------------------- |
| `id`           | UUID PK         |                      |
| `user_id`      | UUID FK → users |                      |
| `content`      | TEXT            |                      |
| `subject_type` | TEXT            | `PERSON` / `MEETING` |
| `subject_id`   | UUID            | Polymorphic FK       |
| `tags`         | JSONB           |                      |
| `created_at`   | TIMESTAMPTZ     |                      |
| `updated_at`   | TIMESTAMPTZ     |                      |

### `ai_threads` [#ai_threads]

Contextual AI conversation containers.

| Column         | Type            | Notes                |
| -------------- | --------------- | -------------------- |
| `id`           | UUID PK         |                      |
| `user_id`      | UUID FK → users |                      |
| `context_type` | TEXT            | `PERSON` / `MEETING` |
| `context_id`   | UUID            | Polymorphic FK       |
| `created_at`   | TIMESTAMPTZ     |                      |

### `chat_messages` [#chat_messages]

Individual messages within an AI thread.

| Column       | Type                  | Notes                                    |
| ------------ | --------------------- | ---------------------------------------- |
| `id`         | UUID PK               |                                          |
| `thread_id`  | UUID FK → ai\_threads |                                          |
| `role`       | TEXT                  | `user` / `assistant` / `system` / `tool` |
| `content`    | TEXT                  |                                          |
| `tool_calls` | JSONB                 |                                          |
| `created_at` | TIMESTAMPTZ           |                                          |

## Migration history [#migration-history]

<Callout type="info">
  Migrations are applied via `./dev db migrate` and live in `services/wordloop-core/scripts/migrations/`.
</Callout>

| Version          | Description                                                        |
| ---------------- | ------------------------------------------------------------------ |
| `20250709123530` | Initial schema (users, people)                                     |
| `20260309152000` | Meetings, transcripts, tasks, notes, AI threads                    |
| `20260313204400` | Add `person_id` to users                                           |
| `20260315213000` | Rename `action_items` → `tasks`                                    |
| `20260324204621` | Add `meeting_audio_files`                                          |
| `20260324211500` | Add `meeting.status`                                               |
| `20260326090621` | Add `meeting.status_message`                                       |
| `20260327200316` | Update transcript segment fields                                   |
| `20260329060000` | Add `is_final` to transcript\_segments                             |
| `20260329204000` | Add `meeting_status_history` (later dropped)                       |
| `20260330203000` | Add pgvector extension, `transcriptions` table, and `voice_vector` |
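
The version strings in this table appear to follow a `YYYYMMDDHHMMSS` timestamp layout. Assuming that holds, a small sketch for parsing and ordering them (the helper name `parse_version` is hypothetical and not part of the `./dev` CLI):

```python
from datetime import datetime


def parse_version(version: str) -> datetime:
    """Parse a 14-digit migration version, assumed to be YYYYMMDDHHMMSS."""
    return datetime.strptime(version, "%Y%m%d%H%M%S")


def in_apply_order(versions: list[str]) -> list[str]:
    """Sort versions chronologically, oldest first."""
    return sorted(versions, key=parse_version)
```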


# Errors (/docs/reference/errors)


# Errors [#errors]

The Wordloop Core API follows RFC 9457 (`application/problem+json`) for error responses. Every error carries a `status` (HTTP code), a `title` (short stable description), and an optional `detail` string with context. Clients and AI agents should branch on `status` and `title` — these are stable and never renumbered.

## Envelope [#envelope]

All error responses follow this shape:

```json
{
  "status": 404,
  "title": "Not Found",
  "detail": "No meeting with the provided id exists.",
  "instance": "/meetings/abc123"
}
```

Validation errors include an `errors` array of field-level diagnostics:

```json
{
  "status": 422,
  "title": "Unprocessable Entity",
  "detail": "Request body did not match the schema.",
  "errors": [
    { "message": "required", "path": "body.title", "value": "" }
  ]
}
```
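
A client typically wants these diagnostics keyed by field. The sketch below groups the `errors` array by `path`; it assumes each entry carries the `message` and `path` keys shown above, and the function name `field_errors` is illustrative.

```python
def field_errors(problem: dict) -> dict[str, list[str]]:
    """Group field-level messages from a problem document by their `path`."""
    grouped: dict[str, list[str]] = {}
    for item in problem.get("errors", []):
        # Entries without a path are collected under the empty string.
        grouped.setdefault(item.get("path", ""), []).append(item.get("message", ""))
    return grouped


example = {
    "detail": "Request body did not match the schema.",
    "errors": [{"message": "required", "path": "body.title", "value": ""}],
}
```

Calling `field_errors(example)` yields a mapping from `body.title` to its list of messages, which a form can render next to the offending input.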

## Common HTTP status codes [#common-http-status-codes]

| Status | Title                 | Meaning                                                           | Action                                        |
| ------ | --------------------- | ----------------------------------------------------------------- | --------------------------------------------- |
| 401    | Unauthorized          | The request lacked a valid Clerk token or session.                | Re-authenticate; refresh token.               |
| 403    | Forbidden             | The caller is authenticated but not authorised for this resource. | Confirm role and scope. Do not retry.         |
| 404    | Not Found             | The resource does not exist.                                      | Verify the identifier; check user visibility. |
| 422    | Unprocessable Entity  | The request body did not match the schema.                        | Inspect `errors` for field-level diagnostics. |
| 409    | Conflict              | An `Idempotency-Key` was reused with a different payload.         | Generate a fresh key; retry.                  |
| 429    | Too Many Requests     | Per-caller rate limit exceeded.                                   | Back off per `Retry-After`.                   |
| 504    | Gateway Timeout       | A downstream dependency timed out.                                | Retry with exponential backoff.               |
| 500    | Internal Server Error | Unexpected server error; details captured in our observability.   | Retry with backoff; escalate if sustained.    |

## WebSocket error frames [#websocket-error-frames]

Real-time errors use a custom envelope on the wire:

```json
{
  "type": "error",
  "error": {
    "code": "SESSION_EXPIRED",
    "message": "Session token expired; reconnect with a fresh one.",
    "details": { "session_id": "sess_..." }
  }
}
```

| Code                | Meaning                                                                                        | Action                                             |
| ------------------- | ---------------------------------------------------------------------------------------------- | -------------------------------------------------- |
| `SESSION_EXPIRED`   | The WebSocket session token is no longer valid.                                                | Fetch a new token; reconnect.                      |
| `RESUME_FAILED`     | The server could not resume the session at the supplied sequence.                              | Reconnect without a resume token; rehydrate state. |
| `BACKPRESSURE_SHED` | Informational: the server dropped a low-priority message because the client could not keep up. | No client action required.                         |
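
A client can dispatch on `error.code` directly. The sketch below only decides what to do; the returned strings are hypothetical labels for client-side behaviour, and unknown codes are logged rather than treated as fatal, which is an assumption rather than documented policy.

```python
import json


def handle_error_frame(raw: str) -> str:
    """Return the client-side step for a raw WebSocket frame."""
    frame = json.loads(raw)
    if frame.get("type") != "error":
        return "not_an_error"
    code = frame.get("error", {}).get("code")
    if code == "SESSION_EXPIRED":
        return "refresh_token_and_reconnect"
    if code == "RESUME_FAILED":
        return "reconnect_without_resume_token"
    if code == "BACKPRESSURE_SHED":
        return "ignore"  # informational only
    return "log_unknown_code"
```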

## Further reading [#further-reading]

* [API Design](/docs/principles/system-design/api-design) — the stance on structured errors.
* [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems) — why stable codes matter for agent consumers.
* [Core API Reference](/docs/reference/api/core) — per-endpoint error catalogues rendered from the OpenAPI spec.


# Glossary (/docs/reference/glossary)


# Glossary [#glossary]

The authoritative vocabulary of Wordloop. When code, docs, or conversation refers to one of these terms, this page is what the term means. The domain-level concepts also appear in [Learn / Concepts](/docs/learn/concepts), where they are explained with more narrative context.

## A [#a]

**ADR — Architecture Decision Record.** An append-only document capturing a significant, hard-to-reverse decision, with explicit debt annotations. See [Decisions](/docs/decisions).

**Adapter.** A component that implements a [port](#p), bridging the domain to an external dependency (database, message broker, HTTP framework). Part of the [hexagonal](#h) architecture vocabulary.

**AsyncAPI.** The machine-readable specification format we use to document event streams — the asynchronous counterpart to OpenAPI. See [Core Events Reference](/docs/reference/events/core-ws).

## B [#b]

**Backpressure.** The explicit control signal by which a producer is slowed when a consumer cannot keep up. In Wordloop, backpressure is designed into every real-time flow — we shed, coalesce, or block rather than buffering unbounded. See [Real-Time](/docs/principles/system-design/real-time).

## C [#c]

**Canary.** A release shape where a small fraction of traffic reaches a new revision before it is promoted to 100%. See [Progressive Delivery](/docs/principles/delivery/progressive-delivery).

**Clerk.** The third-party authentication provider we use for user identity. JWTs from Clerk are verified by `wordloop-core` on every request.

**Core (wordloop-core).** The Go HTTP and WebSocket API that is the source of truth for Meetings, People, Transcriptions, Tasks, and real-time session state.

## D [#d]

**DORA metrics.** Deployment frequency, lead time for changes, change failure rate, and mean time to recover — the four research-backed metrics we use to measure delivery performance. See [DevEx](/docs/principles/delivery/devex).

## E [#e]

**Error budget.** The quantity of "bad" events allowed by an SLO over a rolling window. Consumed by outages; restored by uptime. See [Reliability](/docs/principles/quality/reliability).

**Eval.** A scored comparison of model output against a reference. We run evals in CI to catch regressions in AI-driven behaviour. See [AI Engineering](/docs/principles/ai-native/ai-engineering).

## H [#h]

**Hexagonal architecture.** The structural pattern — domain core, ports, adapters — that every non-trivial Wordloop service follows. See [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture).

## I [#i]

**Idempotency key.** A client-supplied identifier that lets the server recognise and safely handle retried writes. Every write endpoint in Wordloop accepts one. See [API Design](/docs/principles/system-design/api-design).
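
To make the idea concrete, here is a minimal in-memory sketch of the server side. It is not the Core implementation: the class name, the SHA-256 payload fingerprint, and the dictionary store are assumptions chosen for illustration. A repeated key with the same payload replays the stored response; a repeated key with a different payload is rejected, which corresponds to the 409 Conflict case in the Errors reference.

```python
import hashlib
import json
from typing import Any, Callable


class IdempotencyConflict(Exception):
    """Raised when a key is reused with a different payload."""


class IdempotentWrites:
    def __init__(self) -> None:
        # key -> (payload fingerprint, stored response)
        self._seen: dict[str, tuple[str, Any]] = {}

    def handle(self, key: str, payload: dict, create: Callable[[dict], Any]) -> Any:
        digest = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key in self._seen:
            stored_digest, response = self._seen[key]
            if stored_digest != digest:
                raise IdempotencyConflict(key)
            return response  # retried write: replay, do not re-execute
        response = create(payload)
        self._seen[key] = (digest, response)
        return response
```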

**IDP — Internal Developer Platform.** The set of shared tooling, runtimes, and golden paths engineers use to build on Wordloop. See [Platform](/docs/principles/delivery/platform).

## J [#j]

**JIT — Just-in-Time provisioning.** The pattern by which Wordloop creates a local User and Person record the first time a user signs in via Clerk. No webhooks, no seeding. See [Quickstart](/docs/start/quickstart).

## L [#l]

**llms.txt.** The machine-readable index of a documentation site, consumed by AI agents. See [/llms.txt](/llms.txt) and [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems).

## M [#m]

**MCP — Model Context Protocol.** The interoperable protocol we use to expose tools and resources to AI agents. See [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems).

**Meeting.** The primary entity in Wordloop — a bounded session that is captured in the system and attended by People, and that produces a Transcription, a MeetingSynthesis, and Tasks. The `meetings` table and `/meetings` routes are the centre of the domain.

**MeetingSynthesis.** The AI-generated summary attached to a Meeting. Contains a headline, prose summary, key points, Topics, and TalkingPoints. Produced by the ML service after the Transcription finalises.

**ML (wordloop-ml).** The Python FastAPI runtime responsible for transcription, synthesis generation, and embedding.

## N [#n]

**Note.** A free-form annotation attached to a Person or a Meeting via a polymorphic `subject_type` / `subject_id` pair.

## O [#o]

**OpenAPI.** The machine-readable specification format we use to document HTTP APIs. Our server handlers and clients are generated from it. See [API Design](/docs/principles/system-design/api-design).

**OTel — OpenTelemetry.** The vendor-neutral observability framework we use for traces, metrics, and logs. See [Observability](/docs/principles/quality/observability).

**Outbox pattern.** The transactional pattern by which a database write and an event emission are committed together, via an `outbox` table. See [Integration Patterns](/docs/principles/system-design/integration-patterns).
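
A minimal sketch of the pattern, using Python's built-in `sqlite3` in place of Postgres so it runs anywhere. The table layout, the `task.created` event name, and the `published` flag are assumptions for illustration, not the Wordloop schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (id INTEGER PRIMARY KEY, content TEXT NOT NULL);
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY,
    event_type TEXT NOT NULL,
    payload TEXT NOT NULL,
    published INTEGER NOT NULL DEFAULT 0
);
""")


def create_task(conn: sqlite3.Connection, content: str) -> int:
    """Insert the task and its event in one transaction."""
    with conn:  # commits both inserts together, or rolls both back on error
        cur = conn.execute("INSERT INTO tasks (content) VALUES (?)", (content,))
        task_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("task.created", json.dumps({"id": task_id, "content": content})),
        )
    return task_id


def drain_outbox(conn: sqlite3.Connection, publish) -> int:
    """Publish pending events in order and mark each one as published."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because the event is marked only after `publish` returns, a crash between the two steps causes a re-publish on the next drain, so consumers should tolerate duplicates.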

## P [#p]

**Person.** A contact record representing someone who appeared in a Meeting, with or without a Wordloop account. Carries identity fields and an optional voice model for speaker attribution.

**pgvector.** The Postgres extension we use as our production vector store. See [Postgres](/docs/principles/stack/postgres).

**Port.** An interface declared by the domain describing a capability it needs, implemented by an [adapter](#a). Part of the [hexagonal](#h) architecture vocabulary.

## R [#r]

**RAG — Retrieval-Augmented Generation.** The pattern of enriching a model call with retrieved context from our own data. See [AI Engineering](/docs/principles/ai-native/ai-engineering).

**Runbook.** A step-by-step recovery procedure for a known failure mode. See [Operations / Runbooks](/docs/operations/runbooks).

## S [#s]

**SLO — Service Level Objective.** A per-journey target for latency and success rate, measured over a rolling window. The foundation of [Reliability](/docs/principles/quality/reliability).

## T [#t]

**Tag.** A user-defined label applied to Meetings, People, or Tasks for organisation.

**TalkingPoint.** A specific point or claim within a Topic, surfaced as a bullet in the MeetingSynthesis view.

**Task.** An action item extracted from a Meeting. Tasks are assignable, hierarchical, and tracked through to completion. Statuses: `pending`, `in_progress`, `completed`, `canceled`.

**Topic.** A thematic cluster extracted from a Meeting's TranscriptSegments, carrying a name, summary, and the contributing segments.

**Transcription.** The speech-to-text record attached to a Meeting, aggregating TranscriptSegments as they arrive from the ML service.

**TranscriptSegment.** The atomic unit of a Transcription — one speaker turn, carrying speaker label, attributed Person, text, timestamps, confidence score, and an `is_final` flag.

## U [#u]

**User.** A Wordloop account holder, identified via Clerk and JIT-provisioned on first sign-in. Each User has an associated Person record.

## V [#v]

**Voice model.** The speaker-identification vector attached to a Person, built from verified TranscriptSegments and used to attribute future segments to a specific Person.

## W [#w]

**WebSocket.** The default transport for real-time streams in Wordloop. See [Real-Time](/docs/principles/system-design/real-time).


# Reference (/docs/reference)


# Reference [#reference]

Reference material is **information-oriented**: terse, complete, and predictable. If you are returning to Wordloop after a break and need to remember the exact `./dev` flag, the JSON shape of an event, or what error code 4003 means — you are in the right section.

## Contract surfaces [#contract-surfaces]

<Cards>
  <Card title="CLI" href="/docs/reference/cli" description="Every ./dev command in one table, with flags, defaults, and examples." />

  <Card title="API" href="/docs/reference/api/core" description="Core and ML HTTP APIs — OpenAPI-rendered, with live examples." />

  <Card title="Events" href="/docs/reference/events/core-ws" description="WebSocket and Pub/Sub event schemas from our AsyncAPI specs." />

  <Card title="Database" href="/docs/reference/database" description="Schema, tables, columns, indexes, and foreign keys." />

  <Card title="Configuration" href="/docs/reference/configuration" description="Every environment variable, config key, and feature flag, organised by service." />

  <Card title="Errors" href="/docs/reference/errors" description="Error codes with HTTP status, meaning, and operator action." />

  <Card title="Glossary" href="/docs/reference/glossary" description="The canonical definition for every domain term and acronym." />
</Cards>

## A note on sources of truth [#a-note-on-sources-of-truth]

Wherever possible, reference pages are **generated from the same specs the code is generated from**. The API reference is rendered directly from `specs/*-openapi.json`, the events reference from `specs/*-asyncapi.yaml`, and the database schema from the live migrations. If a reference page seems to drift from reality, the spec is the canonical source — open an issue, then fix the spec, and the page will follow.


# Your First Contribution (/docs/start/first-contribution)


# Your First Contribution [#your-first-contribution]

You have the platform running locally and you have read the relevant principle pages. Your first change is a chance to learn the tooling and the review culture, not to design a system. Pick something bounded.

## Good candidates for a first PR [#good-candidates-for-a-first-pr]

* **Fix a typo or broken link in the docs.** The docs site lives in `services/wordloop-docs`. Edit an MDX file, rebuild locally, open a PR.
* **Add a missing Vale word** to our style dictionary when linting flags a legitimate term. This teaches you the quality-governance workflow.
* **Tighten an existing test** against a real behaviour it does not yet cover. Real bugs fall out of this kind of reading; fabricating new features does not.
* **Improve a runbook** after you follow it and find a step that is unclear.

## The mechanics [#the-mechanics]

1. **Branch.** `git checkout -b your-name/short-description` from `main`. Keep names short and descriptive.
2. **Make the change.** Small and focused. If you find yourself fixing two things at once, split into two PRs.
3. **Run the relevant tests.** Use `./dev test <service>` for unit tests and `./dev test system` for cross-service integration. See [Run Tests](/docs/guides/run-tests).
4. **Lint your work.** `./dev lint all` covers Go, Python, and TypeScript; for docs changes run Vale if configured.
5. **Commit.** Write a commit message that explains *why* the change is needed, not just what changed. Our history is a long-term artefact.
6. **Open a PR.** Include a description of the change, the reasoning, a test plan, and links to any related issues or decision records.
7. **Respond to review.** Reviewers may push back on naming, structure, or scope. Treat review comments as invitations to improve the change, not as attacks.
8. **Merge when green.** CI must pass; a reviewer must approve.

## After merge [#after-merge]

Watch the deploy. Our [CI/CD pipeline](/docs/learn/architecture/infrastructure) builds a Docker image, pushes it to Artifact Registry, and deploys to Cloud Run. If anything breaks in production, the on-call engineer will page — you may be asked to revert quickly. That is normal; it means the feedback loop is working.

## What to read next [#what-to-read-next]

* [Guides](/docs/guides) — task-oriented how-tos for the common operations.
* [Engineering Principles](/docs/principles) — the stance behind the code you are about to touch.
* [Reference / CLI](/docs/reference/cli) — every `./dev` command in one table.


# Getting Started (/docs/start/quickstart)


# Getting Started [#getting-started]

## The `./dev` CLI Driver [#the-dev-cli-driver]

All local orchestration, testing, database migrations, and telemetry dashboards are driven exclusively by the custom `./dev` CLI tool located in the repository root. See the [CLI Reference](/docs/reference/cli) to get started!

## Prerequisites [#prerequisites]

| Tool                                          | Version        | Purpose                                                       |
| --------------------------------------------- | -------------- | ------------------------------------------------------------- |
| [Docker](https://docs.docker.com/get-docker/) | Compose v2.20+ | Infrastructure services                                       |
| [Go](https://go.dev/)                         | 1.25+          | wordloop-core                                                 |
| [Air](https://github.com/air-verse/air)       | latest         | Go auto-reload (`go install github.com/air-verse/air@latest`) |
| [uv](https://github.com/astral-sh/uv)         | latest         | wordloop-ml Python env                                        |
| [pnpm](https://pnpm.io/)                      | latest         | wordloop-app dependencies                                     |
| [ffmpeg](https://ffmpeg.org/)                 | latest         | ML audio processing                                           |

{/* LLM-Context: TL;DR:
  This guide is a "Day Zero" guided walkthrough. It moves beyond raw commands, mapping out 
  how developers should use `./dev start infra` to bootstrap their local environment, 
  and attach IDE debuggers (like VSCode / GoLand) specifically for service debugging.
  */}

## The "Day Zero" Guided Walkthrough [#the-day-zero-guided-walkthrough]

Welcome to Wordloop! Instead of throwing a wall of terminal commands at you, this guide walks you through setting up your environment for an optimal local development experience, including hooking up your IDE debuggers.

### Step 1: Environment Checks [#step-1-environment-checks]

Before starting, validate your local toolchain:

```bash
# Assumes you have cloned the repo and are at the root
./dev doctor
```

If `doctor` flags any missing dependencies (like Docker, Go, or Node) or occupied ports, follow its provided instructions to resolve them.

### Step 2: Bootstrapping Config & Secrets [#step-2-bootstrapping-config--secrets]

Generate and configure your local environment files:

```bash
./dev setup env
```

This scaffolds `.env` and `.env.local` files across the monorepo.

* **wordloop-ml:** Edit to add ML/AI API keys.
* **wordloop-app & core:** Add your Clerk frontend & backend keys for authentication.

### Step 3: Install Package Dependencies [#step-3-install-package-dependencies]

```bash
./dev setup install
```

### Step 4: Infrastructure & IDE Debugging [#step-4-infrastructure--ide-debugging]

We use a Hybrid Development Model. Infrastructure (Postgres, Pub/Sub, etc.) runs in Docker, so you can run your target application natively in your IDE.

If you are working on the App Frontend but want to run Core natively so you can step through Go code:

1. **Start the dependencies in the background:**
   ```bash
   ./dev start infra ml app
   ```
   *This starts the DB, Pub/Sub, the ML service, and the Next.js frontend.*

2. **Launch the Core service in your IDE:**
   * **VSCode:** Open the debug panel and run the "Launch Core API" configuration.
   * **GoLand:** Run the `cmd/server/main.go` file with Debug context.

   Now, any frontend requests will hit your breakpoints in the Core API.

If you just want to run everything locally without IDE debugging (e.g., verifying a PR):

```bash
./dev start all
```

## The Hybrid Development Model [#the-hybrid-development-model]

Infrastructure runs in Docker (stable, rarely changes). Application services run natively for instant feedback:

| What runs                       | Where             | Auto-reload?                    |
| ------------------------------- | ----------------- | ------------------------------- |
| Postgres, PubSub, Storage, OTel | Docker containers | n/a                             |
| Core API (Go)                   | Native via Air    | ✅ Rebuilds on `.go` file change |
| ML API (Python)                 | Native via uv     | ✅ Restarts on `.py` file change |
| App (Next.js)                   | Native via pnpm   | ✅ HMR in browser                |

### Typical workflows [#typical-workflows]

```bash
# Full stack (recommended for daily work)
./dev start all

# Infrastructure only (run services from your IDE)
./dev start infra

# Infrastructure + specific services
./dev start infra core        # Debug ML from IDE
./dev start infra core ml     # Debug App from IDE

# Force Docker containers (for integration testing)
./dev start core ml --docker
```

### Tailing logs [#tailing-logs]

Native service logs are written to `.dev/logs/` and can be tailed with the same CLI:

```bash
./dev logs core       # Tail Core output
./dev logs ml         # Tail ML output
./dev logs core ml    # Multi-tail Core + ML simultaneously
./dev logs all        # Tail everything (Docker)
```

## Full stack in Docker [#full-stack-in-docker]

For CI-like environments or full-stack integration testing:

```bash
./dev start all --docker   # Everything in containers
./dev logs all             # Tail all logs
./dev stop all             # Stop everything
```

## Authentication [#authentication]

Authentication is handled automatically through **JIT provisioning**:

1. Sign in via Clerk (Google, email, or test accounts) in the browser
2. The Core API verifies the Clerk JWT
3. If the user doesn't exist locally yet, they're auto-created from the Clerk API
4. No webhook tunnels, no manual tokens, no database seeding
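
The get-or-create step in the list above can be sketched as follows. This is an in-memory illustration, not the Core implementation (which is written in Go): the `User` fields, the `InMemoryStore`, and the `fetch_profile` callback standing in for the Clerk API call are all hypothetical, and the sketch assumes the verified JWT's `sub` claim carries the Clerk user id.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class User:
    id: str
    clerk_id: str
    email: str
    person_id: str  # each User has an associated Person record


@dataclass
class InMemoryStore:
    users: dict[str, User] = field(default_factory=dict)  # keyed by clerk_id


def provision(store: InMemoryStore, claims: dict,
              fetch_profile: Callable[[str], dict]) -> User:
    """Return the local user for verified JWT claims, creating it on first sight."""
    clerk_id = claims["sub"]
    existing = store.users.get(clerk_id)
    if existing is not None:
        return existing  # already provisioned: no remote call needed
    profile = fetch_profile(clerk_id)  # only reached on first sign-in
    user = User(
        id=str(uuid.uuid4()),
        clerk_id=clerk_id,
        email=profile["email"],
        person_id=str(uuid.uuid4()),
    )
    store.users[clerk_id] = user
    return user
```

A real implementation would also need a unique constraint on the Clerk id so that two concurrent first requests cannot create duplicate users.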

<Callout type="info">
  System tests use a separate `APP_ENV=test` mode with raw UUID tokens. See [Testing](/docs/principles/foundations/testing) for details.
</Callout>

## Linting [#linting]

```bash
./dev lint        # Lints core (go vet), ml (ruff), and app (eslint)
./dev lint core   # Go linter only
./dev lint ml     # Python Ruff only
./dev lint app    # TypeScript ESLint only
```

## Checking status [#checking-status]

```bash
./dev status      # Show nicely formatted dashboard of running services
```

See [CLI Reference](/docs/reference/cli) for the complete target list.


# How We Work (/docs/work)


# How We Work [#how-we-work]

This section describes how we move from an observed problem to shipped customer value. The process is lean by design, enforcing that technical execution is strictly bound to clear intent and verified by automated tests from the very beginning. It answers the fundamental question: *How do you move fast without skipping the discovery that stops you building the wrong thing?*

Work flows through four stages, each more concrete than the last:

<Mermaid
  chart="`graph TD
  PS[&#x22;1. Problem Statement&#x22;]
  P[&#x22;2. Pitch&#x22;]
  TDD[&#x22;3. TDD&#x22;]
  
  subgraph TDD_Phase [TDD Architecture]
    direction TB
    J[&#x22;UI Design & Data Flows&#x22;]
    C[&#x22;Contracts & Schemas&#x22;]
    M[&#x22;Milestones (User Visible Value)&#x22;]
    S[&#x22;Domain Slices&#x22;]
    
    J --> C
    C --> M
    M --> S
  end

  PS --> P
  P --> TDD
  TDD -.-> TDD_Phase
  TDD_Phase --> DONE[&#x22;✓ Delivered (Archive + Measure)&#x22;]`"
/>

***

# Inside each stage [#inside-each-stage]

## 1. Problem Statement [#1-problem-statement]

Most wasted work is caused by excellent execution of the wrong thing. Skipping from idea to solution — without pausing to understand the problem — leads to building with false confidence.

A **Problem Statement** captures observed pain — real, evidenced, specific — alongside an **appetite**: a judgment about how much time this problem is worth solving.

* **Appetite** is not an estimate of how long a solution will take. It is an opportunity cost judgment made *before* the solution is defined. You are betting that the problem is worth that much time.
* Problem statements do not accumulate indefinitely. They are a curated list, updated as understanding evolves and retired when no longer relevant.
* **Platform and infrastructure problems are valid problem statements.** The "who experiences it" can be internal — the engineering team, the system's reliability, the business's compliance posture. Feature bets routinely surface infrastructure gaps (e.g., a missing event backplane, no deletion cascade). The right response is to extract the gap as its own problem statement — not to expand the feature bet. The feature bet declares the constraint explicitly; the platform bet solves it.

## 2. Pitch [#2-pitch]

Unformed ideas become backlogs. Backlogs create the illusion that everything is captured and considered, when really they are lists of things nobody explicitly said no to.

Before a problem reaches the build phase, it is shaped into a **Pitch**. A pitch links a validated problem to a rough solution proposal. It is concrete enough to execute against but stays away from micro-detail.

A pitch must contain:

* **The problem** — what was observed, who experiences it, why it matters now.
* **The appetite** — how much time to spend.
* **A rough solution sketch** — the general approach to the solution.
* **Rabbit holes** — approaches already considered and ruled out. Include plausible-looking approaches that would blow the appetite or the scope, and infrastructure assumptions the bet makes (e.g., "we assume sticky sessions, not a backplane").
* **Explicit no-gos** — what is completely out of scope. Include both obvious exclusions *and* natural extensions that users would reasonably expect but that don't belong in this version (e.g., pause/resume, mobile support, export/download). Vague no-gos invite scope creep — be specific about what's excluded and why.

A funded pitch becomes a **Bet** — a commitment bounded by the appetite.

## 3. TDD: Foundations [#3-tdd-foundations]

Technical Design bridges the intent of the pitch to parallel execution. We start by laying the technical foundation so that progress isn't blocked later by misaligned interfaces.

### UI Design [#ui-design]

The **UI Design** doc translates the pitch's rough solution sketch into concrete, screen-level detail. It answers: *what exactly will the user see and do?*

Organise it by screen — not by feature, not by user story. Each screen the bet touches gets its own section with:

* **A wireframe** — even a rough sketch. This is the anchor; the text describes it.
* **Layout** — the regions on screen and what content lives in each one.
* **States** — what the user sees during loading, active use, empty states, errors, and degraded conditions.
* **Key interactions** — what the user can do and what happens in response.

After the screens, map the **user journeys** between them (how the user moves from entry point to final outcome), and list **edge cases** (anything unusual the system needs to handle visibly).

**Be specific about data objects.** If a screen shows tasks, define what a task is: which fields it has, which are required, whether they nest, what states they can be in. If a screen has a text editor, say whether it's rich text or plain, whether it auto-saves or has a save button. These details directly determine the API contracts and database schema that come next.

**Stay at the user level.** If you're specifying which service owns the logic, how the frontend integrates, or where data persists — you've gone too far. The UI Design doc describes what the user experiences, not how the system delivers it. System concerns belong in the Data Flow doc.

**The output feeds directly into:** Data Flow diagrams (which service calls which), API contracts (what fields and endpoints exist), and database schemas (what gets stored). If someone can't design those artefacts from the UI Design doc alone, the doc isn't detailed enough.

### Data Flows [#data-flows]

The **Data Flow** doc maps every user interaction from the UI Design through service boundaries. It answers: *what calls what, what data crosses each boundary, and what happens when something fails.* It is the primary input for API contracts and database schemas — if someone can't design those artefacts from this doc alone, the doc isn't complete.

**Start with a system context graph.** Before drawing any sequences, draw the topology: which services exist, which protocols connect them, which data stores each service owns. This is a static map — it orients readers and makes the scope of the bet explicit.

**Name flows after what triggers them.** Group related flows into logical Parts (e.g., "Session Lifecycle", "Streaming Processing", "Failure Modes"). A flow name describes what the user does or what system condition fires — not the implementation.

**Use descriptive operation labels — never endpoint paths.** Diagram labels should read like `Create task (idempotent, echo-suppressed)` not `POST /meetings/{id}/tasks`. Header names, field names, and HTTP methods all belong in the Contracts doc. Each arrow in a flow is a **contract boundary** (what shape the data takes) and a **sequencing constraint** (downstream cannot build until upstream is agreed). Naming the operation is enough — the Contracts doc defines the shape precisely.

**Failure modes are required, not optional.** For every significant service boundary in the bet, there must be at least one flow describing what happens when that boundary fails. If the UI Design doc models a "Degraded" or "Connectivity Lost" state, the Data Flow doc must show the recovery sequence. Resilience is a first-class design concern — not an afterthought.

**Close with two required sections:**

* **Design Decisions** — tradeoffs made, alternatives ruled out, constraints that drove choices. Captures reasoning that isn't visible in the diagrams.
* **Boundary Inventory** — a table of every service-to-service boundary in the doc. Five columns: Boundary | Flows | From → To | Protocol | Data shape. Each row here becomes a contract entry in the Contracts doc.

### Contracts & Schemas [#contracts--schemas]

The **Contracts & Schemas** doc records the agreed API contracts (REST, WebSocket, Pub/Sub) and database schemas (PostgreSQL, object storage). Once they are agreed, the downstream UI can mock against the contract while upstream Core builds against it.

## 4. TDD: Execution [#4-tdd-execution]

Once the technical foundation is set, the bet is decomposed into deliverable units.

* **Integration Milestones:** Points of user-visible value. This is the integration of multiple pieces that results in a cohesive feature or state change for the user.
* **Domain Slices:** The smallest independently buildable and testable units of work. We **never** slice horizontally (e.g., building all databases, then all APIs, then all UI); we always slice vertically. A vertical slice can span services (a full feature connecting App -> Core -> ML) or sit entirely within a single domain (e.g., CRUD on a Meeting in Core via the API). Slices must be independently deployable and verifiable.

### Tests as Proof of Delivery [#tests-as-proof-of-delivery]

Every milestone and slice documents its test overview, and corresponding **empty test stubs are generated in the test runner** before any production code is written.

These tests serve as the **single source of truth** for progress signaling: red means work to do; green means proven. They live in two places:

1. **Service/system tests** (permanent) — implemented in the service repo or `tests/system/` during the build.
2. **Bet progress suite** (temporary) — mirrored in `tests/bets/<slug>/`, run on demand via `./dev test bet <slug>`.

***

## Bet Operations [#bet-operations]

The Golden Path CLI tools keep bet documentation in sync with the integration test layout.

### Start a new bet [#start-a-new-bet]

```bash
./dev new bet <slug>
```

Promotes a pitch into an active bet at `work/<slug>/` and creates the baseline test boundary suite in `tests/bets/<slug>/`. The slug must be lowercase kebab-case (e.g. `speaker-navigation`). A pitch must exist first — run `./dev new pitch <slug>`.

### Scaffolding Architecture (TDD) [#scaffolding-architecture-tdd]

```bash
# Scaffolds architectural boundaries
./dev new contract <bet-slug> <service> <protocol>
./dev new schema <bet-slug> <service> <database-tech>

# Scaffolds milestones and domains
./dev new milestone <bet-slug> <milestone-slug>
./dev new slice <bet-slug> <milestone-slug> <domain> <slice-slug>
```

Generating a `slice` or a `milestone` also creates the corresponding placeholder test stubs in `tests/bets/`.

### Run bet progress tests [#run-bet-progress-tests]

```bash
./dev test bet <slug>
```

Runs the bet progress suite on demand. Watch the test output to verify that your delivery is progressing as intended.

### Archive a delivered bet [#archive-a-delivered-bet]

```bash
./dev archive bet <slug>
```

Moves the bet directory to `_archive/` and the associated test suite to `tests/bets/_archive/<slug>/`. URL routing is preserved.


# Authentication & Authorization (/docs/learn/architecture/auth)


# Authentication & Authorization [#authentication--authorization]

Wordloop delegates all identity management to **Clerk** and keeps a local user record only to anchor database relations.

<Callout type="warn" title="Important">
  Internal services authenticate to each other with a static shared token for system-level trust. Zero-trust principles apply at external boundaries; inherited trust applies internally.
</Callout>

## User Authentication Flow (Clerk) [#user-authentication-flow-clerk]

Clerk acts as our authoritative identity provider (IdP).

<Mermaid
  chart="`sequenceDiagram
  participant User
  participant App as &#x22;App (Next.js)&#x22;
  participant Clerk
  participant Core as &#x22;Core (Go)&#x22;

  User->>App: Login
  App->>Clerk: Authenticate
  Clerk-->>App: Return JWT Session Token
  App->>Core: Request with HTTP Bearer Token
  Core->>Clerk: Fetch JWKS & Verify
  Core-->>App: Return Data
`"
/>

### Frontend Implementation [#frontend-implementation]

* **Identity Context:** `wordloop-app` uses `@clerk/nextjs` for all auth flows.
* **Header Injection:** JWT tokens are automatically injected into `wordloop-core` requests as `Authorization: Bearer <token>` by the Orval API clients via a custom fetch interceptor.

### Backend Validation [#backend-validation]

* **Middleware:** `wordloop-core` runs Clerk authentication middleware within the Huma framework.
* **Verification:** The middleware verifies the JWT signature against the public keys published at Clerk's JWKS endpoint, then stores the `clerk_user_id` in the request's `context.Context`.

## Data Synchronization [#data-synchronization]

To link auth identities with core business entities (like Meetings or Transcripts), users are synchronized into the local Postgres database.

<Callout type="info">
  Database synchronization occurs asynchronously via Clerk Webhooks.
</Callout>

1. **User Creation:** When a user registers, Clerk fires a `user.created` webhook to `wordloop-core`.
2. **Database Sink:** Core validates the Svix headers, parses the webhook payload, and idempotently upserts the record into the `users` table.

## Service-to-Service Authentication [#service-to-service-authentication]

When internal services communicate outside of standard user contexts (e.g., the ML engine pulling an audio binary from Core API endpoints), they use a static symmetric token.

* **Header Specification:** `Authorization: Bearer <SERVICE_AUTH_TOKEN>`
* **Assumed Scope:** Full administrative access.

<Callout type="warn">
  **Never expose the `SERVICE_AUTH_TOKEN` to the frontend or public-facing API routes.** This token bypasses user validation logic.
</Callout>


# Optimistic Mutation with Echo-Suppressed Streaming (/docs/learn/architecture/data-flow)


# Optimistic Mutation with Echo-Suppressed Streaming [#optimistic-mutation-with-echo-suppressed-streaming]

This is Wordloop's core data architecture for all user-initiated CRUD operations. The pattern separates **writes** (REST) from **reads** (WebSocket) to achieve perceived zero-latency mutations with real-time multi-device synchronization.

<Callout type="info" title="Canonical Reference">
  This pattern governs all entity-level operations — notes, tasks, topics, meeting metadata, and any future entity types. Audio streaming and ML-generated events use different pipelines documented in [System Workflows](/docs/learn/architecture/system-workflows).
</Callout>

## Why This Design [#why-this-design]

Traditional request/response flows force the user to wait for the server round-trip before seeing results. Polling-based updates miss state changes between intervals. Full event sourcing introduces operational complexity that isn't justified for Wordloop's entity CRUD workloads.

This pattern sits in the pragmatic middle:

| Concern               | Approach                                                                                                             |
| --------------------- | -------------------------------------------------------------------------------------------------------------------- |
| **Write path**        | REST — transactional, idempotent, familiar error handling. The server is the single source of truth.                 |
| **Read path**         | WebSocket — server pushes complete entity payloads on every state change. No polling, no stale cache windows.        |
| **Perceived latency** | Optimistic updates — the client applies the change locally before the REST response. The UI responds in under 16ms.  |
| **Multi-device sync** | All connected clients for a user receive every state change via WebSocket. No refresh required.                      |
| **Echo prevention**   | Source-aware events — the originating client ignores its own echo by matching the `clientId` on the WebSocket event. |

***

## The Five-Step Data Loop [#the-five-step-data-loop]

Every mutation follows this exact sequence:

<Mermaid
  chart="`sequenceDiagram
  participant C1 as &#x22;Client A (Originator)&#x22;
  participant Core as &#x22;wordloop-core&#x22;
  participant DB as &#x22;Postgres&#x22;
  participant WS as &#x22;WebSocket Hub&#x22;
  participant C2 as &#x22;Client B (Other Device)&#x22;

  Note over C1: 1. Optimistic Update
  C1->>C1: Apply change to local UI immediately
  C1->>C1: Store rollback snapshot

  Note over C1,Core: 2. REST Mutation
  C1->>Core: POST /notes (X-Client-Id: abc-123)
  Core->>DB: INSERT note RETURNING *
  DB-->>Core: Row with server-assigned ID + timestamps

  Note over Core,WS: 3. Event Broadcast
  Core->>WS: Emit note.created (sourceClientId: abc-123)
  Core-->>C1: 201 Created (full entity)

  Note over C1: 4. Echo Suppression
  WS-->>C1: note.created (sourceClientId: abc-123)
  C1->>C1: sourceClientId matches own → discard

  Note over C2: 5. Cross-Device Sync
  WS-->>C2: note.created (sourceClientId: abc-123)
  C2->>C2: sourceClientId differs → apply to UI
`"
/>

***

## Step-by-Step Breakdown [#step-by-step-breakdown]

### Step 1 — Optimistic Update [#step-1--optimistic-update]

When a user performs an action (add note, edit title, delete task), the client applies the change to local state **immediately**, before the network request fires. Three things happen:

1. **The change is applied to the UI.** The user sees the result instantly.
2. **A rollback snapshot is stored.** If the server rejects the mutation, the client reverts to this snapshot.
3. **A pending indicator is shown.** Optimistic entities render with a subtle visual cue (reduced opacity, syncing badge, or a small spinner) so the user understands the change is not yet confirmed. The indicator is removed when the REST response arrives.

For entity **creation**, the client generates a temporary ID (a UUID prefixed with `temp_`) so the new entity can appear in the UI and be referenced before the server assigns a permanent ID.
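
A minimal sketch of the optimistic create described above, assuming notes are held in an immutable map. The `Note` shape, the `pending` flag, and `applyOptimisticCreate` are hypothetical names for illustration, not the actual `wordloop-app` store.

```typescript
type Note = { id: string; content: string; pending: boolean };
type NoteState = ReadonlyMap<string, Note>;

// What the caller gets back: the state to render, the snapshot to restore
// if the server rejects the mutation, and the temporary ID in use.
type OptimisticResult = { next: NoteState; rollback: NoteState; tempId: string };

function applyOptimisticCreate(
  state: NoteState,
  content: string,
  newUuid: () => string, // assumption: e.g. () => crypto.randomUUID() in a browser
): OptimisticResult {
  const tempId = `temp_${newUuid()}`;
  const next = new Map(state);
  // `pending: true` drives the visual "not yet confirmed" indicator.
  next.set(tempId, { id: tempId, content, pending: true });
  // The previous state is never mutated, so it doubles as the rollback snapshot.
  return { next, rollback: state, tempId };
}
```

Because the previous state is left untouched, rolling back is just a matter of rendering `rollback` again.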

### Step 2 — REST Mutation [#step-2--rest-mutation]

The mutation is sent to the appropriate REST endpoint with two critical headers:

```http
POST /api/v1/notes HTTP/1.1
Authorization: Bearer <jwt>
X-Client-Id: abc-123
Content-Type: application/json

{
  "meetingId": "mtg_01J...",
  "content": "Follow up with the design team"
}
```

| Header          | Purpose                                                                                                                |
| --------------- | ---------------------------------------------------------------------------------------------------------------------- |
| `Authorization` | User identity (JWT from Clerk). Determines **who** is performing the action.                                           |
| `X-Client-Id`   | Client instance identity. Determines **which device/tab** initiated the action. Used exclusively for echo suppression. |

The REST response returns the **complete server-authoritative entity** — including the server-assigned `id`, `createdAt`, `updatedAt`, and `version` fields. The client uses this response to replace its temporary optimistic state with the confirmed server state.

### Step 3 — Event Broadcast [#step-3--event-broadcast]

After the database write succeeds, Core publishes a WebSocket event to **all connected clients** within the event's scope. The event uses the CloudEvents envelope and carries the full entity payload:

```json
{
  "specversion": "1.0",
  "type": "note.created",
  "source": "wordloop-core",
  "id": "evt_01J...",
  "data": {
    "id": "note_01J...",
    "meetingId": "mtg_01J...",
    "content": "Follow up with the design team",
    "createdAt": "2026-04-17T20:00:00Z",
    "updatedAt": "2026-04-17T20:00:00Z",
    "version": 1
  },
  "sourceClientId": "abc-123"
}
```

<Callout type="info" title="Complete Payloads, Not Diffs">
  Events carry the full entity state, not a delta. This keeps client logic simple — the receiving client replaces its local copy of the entity directly without applying patch operations or maintaining a change log. The trade-off is larger payloads, which is acceptable for Wordloop's entity sizes.
</Callout>

### Step 4 — Echo Suppression [#step-4--echo-suppression]

The originating client receives the WebSocket event and compares `sourceClientId` against its own client ID:

```
Incoming event sourceClientId: "abc-123"
My clientId:                    "abc-123"
→ Match. Discard event (UI already reflects this from the optimistic update).
```

Without echo suppression, the originating client would render the change twice — once from the optimistic update and once from the WebSocket event — causing visual flicker and duplicate list entries.

### Step 5 — Cross-Device Sync [#step-5--cross-device-sync]

Other clients connected for the same user receive the identical WebSocket event. Since their `clientId` does not match the `sourceClientId`, they apply the entity payload directly to their local UI state:

```
Incoming event sourceClientId: "abc-123"
My clientId:                    "def-456"
→ No match. Apply entity to local state. UI updates in real time.
```

No REST call is needed. The WebSocket event contains the complete entity, so the receiving client has everything it needs to render the change.
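
Steps 4 and 5 reduce to one comparison on the receiving side. The sketch below assumes a simplified event shape; `EntityEvent` and `routeEvent` are illustrative names rather than the real client code.

```typescript
type EntityEvent<T> = { type: string; data: T; sourceClientId?: string };

// Returns false when the event is the client's own echo and was discarded,
// true when the entity payload was handed to `apply`.
function routeEvent<T>(
  event: EntityEvent<T>,
  myClientId: string,
  apply: (entity: T) => void,
): boolean {
  if (event.sourceClientId !== undefined && event.sourceClientId === myClientId) {
    return false; // Step 4: the UI already reflects this change
  }
  apply(event.data); // Step 5: another tab or device made the change
  return true;
}
```

Events without a `sourceClientId` are always applied in this sketch, since no client can claim them as its own echo.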

***

## Client Identity [#client-identity]

### What Is a Client ID? [#what-is-a-client-id]

A `clientId` is a UUID generated **per browser tab** when the application initializes. It is **not** tied to the user's authentication identity — a single user can have multiple client IDs across different tabs and devices.

| Property        | Value                                                                                   |
| --------------- | --------------------------------------------------------------------------------------- |
| **Scope**       | One per browser tab / app instance                                                      |
| **Lifetime**    | Created on tab open, discarded on tab close                                             |
| **Persistence** | Stored in `sessionStorage` (survives page refresh within the same tab, not across tabs) |
| **Format**      | UUIDv4 (shortened to placeholders such as `abc-123` in the examples on this page)       |
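
One way to get the scope and persistence listed in the table is to read the ID from per-tab storage and create it only when absent. In this sketch the storage key and the `getOrCreateClientId` helper are assumptions, and the store is typed structurally so the code also runs outside a browser.

```typescript
// Structural subset of the Web Storage API (sessionStorage satisfies it).
type KeyValueStore = {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
};

const CLIENT_ID_KEY = "wordloop.clientId"; // hypothetical storage key

function getOrCreateClientId(store: KeyValueStore, newUuid: () => string): string {
  const existing = store.getItem(CLIENT_ID_KEY);
  if (existing) {
    return existing; // page refresh within the same tab reuses the ID
  }
  const created = newUuid();
  store.setItem(CLIENT_ID_KEY, created);
  return created;
}
```

In a browser this could be called as `getOrCreateClientId(window.sessionStorage, () => crypto.randomUUID())`.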

### Why Per-Tab, Not Per-Session? [#why-per-tab-not-per-session]

If the client ID were per-session (shared across tabs), a mutation from Tab A would suppress the WebSocket event in Tab B — meaning Tab B would never render the change. Per-tab IDs ensure that only the exact tab that initiated the mutation suppresses the echo.

### Why Client ID, Not Mutation ID? [#why-client-id-not-mutation-id]

Some architectures use a unique `mutationId` per operation instead of a persistent `clientId`. The trade-off:

| Approach                 | Pros                                                                                                  | Cons                                                                                                   |
| ------------------------ | ----------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| **Client ID** (Wordloop) | Simpler — one header, no per-mutation tracking state. Echo suppression is a single string comparison. | Cannot distinguish between two of *your own* rapid mutations on the same entity (both are suppressed). |
| **Mutation ID**          | Each operation is individually tracked. Can precisely reconcile specific operations.                  | Requires a pending-mutation queue on the client and mutation-ID propagation through the server.        |

Wordloop uses `clientId` because entity operations are independent and non-overlapping — a user does not typically create the same note twice in rapid succession. If Wordloop ever introduces collaborative editing where individual keystrokes must be tracked, mutation IDs would be required.

***

## Event Scoping [#event-scoping]

Not every connected client receives every event. Events are scoped to the relevant audience:

| Scope       | Events Delivered To                       | Example                                            |
| ----------- | ----------------------------------------- | -------------------------------------------------- |
| **User**    | All clients authenticated as that user    | `note.created`, `task.updated`, `meeting.deleted`  |
| **Meeting** | All clients viewing that specific meeting | `transcript.segment.produced`, `insight.generated` |

The WebSocket hub maintains a mapping of `userId → [clientId, clientId, ...]` and a subscription registry of `meetingId → [clientId, clientId, ...]`. When Core emits an event, the hub resolves the target audience and delivers to only those connections.
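
The two lookups can be modelled as a pair of maps. This is a TypeScript illustration of the data structure only; `AudienceRegistry` and `Scope` are hypothetical names and the real hub in `wordloop-core` may be organised differently.

```typescript
type Scope =
  | { kind: "user"; userId: string }
  | { kind: "meeting"; meetingId: string };

class AudienceRegistry {
  private byUser = new Map<string, Set<string>>();
  private byMeeting = new Map<string, Set<string>>();

  // Called when a client authenticates on the WebSocket.
  connect(userId: string, clientId: string): void {
    this.add(this.byUser, userId, clientId);
  }

  // Called when a client opens a specific meeting.
  subscribe(meetingId: string, clientId: string): void {
    this.add(this.byMeeting, meetingId, clientId);
  }

  // Resolves the client IDs that should receive an event for this scope.
  resolve(scope: Scope): string[] {
    const clients =
      scope.kind === "user"
        ? this.byUser.get(scope.userId)
        : this.byMeeting.get(scope.meetingId);
    return clients ? Array.from(clients) : [];
  }

  private add(index: Map<string, Set<string>>, key: string, clientId: string): void {
    const clients = index.get(key) ?? new Set<string>();
    clients.add(clientId);
    index.set(key, clients);
  }
}
```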

***

## Initial State Hydration [#initial-state-hydration]

When a client first loads, the WebSocket is not yet connected. The client must establish initial state before subscribing to real-time updates. The sequence is:

<Mermaid
  chart="`sequenceDiagram
  participant Client
  participant Core as &#x22;wordloop-core&#x22;
  participant WS as &#x22;WebSocket Hub&#x22;

  Note over Client: Page Load
  Client->>Core: GET /api/v1/meetings/:id (REST)
  Core-->>Client: Full meeting state (notes, tasks, topics)
  Client->>Client: Render initial UI

  Client->>WS: Connect WebSocket (JWT + clientId)
  WS-->>Client: Connection acknowledged

  Note over Client,WS: Now receiving real-time updates
  WS-->>Client: note.created (from another device)
  Client->>Client: Apply to UI
`"
/>

<Callout type="warn" title="Hydration Race Condition">
  Between the REST response and the WebSocket connection being established, events can be lost. To handle this, the client should include a `since` timestamp (from the REST response's latest `updatedAt`) in the WebSocket connection handshake. Core replays any events that occurred after that timestamp during the connection setup.
</Callout>

***

## Edge Cases [#edge-cases]

### Mutation Failure and Rollback [#mutation-failure-and-rollback]

If the REST call fails, the client must **undo the optimistic update** and surface the error. The rollback strategy depends on the error category:

| Error Category                | HTTP Status   | Retry? | Client Behavior                                                                                                      |
| ----------------------------- | ------------- | ------ | -------------------------------------------------------------------------------------------------------------------- |
| **Validation / Client Error** | 400, 409, 422 | ❌ No   | Roll back immediately. Surface error to user. The request is invalid or conflicting, so retrying won't help.         |
| **Authentication Error**      | 401, 403      | ❌ No   | Roll back. Redirect to login or refresh token.                                                                       |
| **Not Found**                 | 404           | ❌ No   | Roll back. Entity was deleted by another client. Surface "item no longer exists" notification.                       |
| **Server Error**              | 500, 502, 503 | ✅ Yes  | Retry with exponential backoff + jitter (1s, 2s, 4s, max 3 retries). Roll back only after all retries are exhausted. |
| **Network Timeout**           | —             | ✅ Yes  | Retry once. If it still fails, roll back and surface ambiguous error: "Changes may not have been saved."             |

<Mermaid
  chart="`sequenceDiagram
  participant Client
  participant Core as &#x22;wordloop-core&#x22;

  Client->>Client: Apply optimistic update + store rollback
  Client->>Core: POST /notes
  Core-->>Client: 422 Validation Error

  Client->>Client: Restore rollback snapshot
  Client->>Client: Surface error toast to user
`"
/>

For **network timeouts**, the client cannot know whether the server received and processed the request. If the mutation did succeed server-side, the WebSocket event will eventually deliver the confirmed state — at which point the client should silently accept it rather than showing a duplicate.
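
The error-category table above can be encoded as a small decision function. `classifyFailure` and `FailureAction` are illustrative names; treating every unlisted status as a non-retryable client error is an assumption of this sketch, not something the table specifies.

```typescript
type FailureAction =
  | { kind: "rollback"; reason: "client" | "auth" | "not_found" }
  | { kind: "retry"; maxRetries: number };

function classifyFailure(status: number | "timeout"): FailureAction {
  switch (status) {
    case "timeout":
      return { kind: "retry", maxRetries: 1 }; // retry once, then roll back
    case 500:
    case 502:
    case 503:
      return { kind: "retry", maxRetries: 3 }; // exponential backoff + jitter
    case 401:
    case 403:
      return { kind: "rollback", reason: "auth" };
    case 404:
      return { kind: "rollback", reason: "not_found" };
    default:
      // 400, 409, 422 and (by assumption) anything not listed in the table
      return { kind: "rollback", reason: "client" };
  }
}
```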

### Optimistic ID Reconciliation [#optimistic-id-reconciliation]

When creating a new entity, the client uses a temporary ID (`temp_xxx`) for the optimistic update. When the REST response returns with the server-assigned ID, the client must **replace the temporary ID** everywhere it appears in local state:

```
Optimistic state:  { id: "temp_abc", content: "..." }
REST response:     { id: "note_01J...", content: "...", createdAt: "..." }
→ Replace temp_abc → note_01J in all local state references
```

The subsequent WebSocket echo is suppressed by `clientId` matching, so no further reconciliation is needed for the originating client.
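
As a sketch of the replacement step, assuming entities are keyed by ID in a map (`reconcileTempId` is a hypothetical helper): the temporary entry is swapped for the confirmed entity while keeping its position in the list. Any other place that holds the temporary ID, such as a selection or an open editor, would need the same substitution.

```typescript
type Entity = { id: string };

function reconcileTempId<T extends Entity>(
  state: ReadonlyMap<string, T>,
  tempId: string,
  confirmed: T,
): Map<string, T> {
  const next = new Map<string, T>();
  state.forEach((entity, id) => {
    if (id === tempId) {
      next.set(confirmed.id, confirmed); // swap in place, preserving order
    } else {
      next.set(id, entity);
    }
  });
  return next;
}
```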

### WebSocket Event Arrives Before REST Response [#websocket-event-arrives-before-rest-response]

The WebSocket event can arrive at the originating client **before** the REST response under high load. This is safe because:

1. Echo suppression discards the event regardless of timing (the `clientId` matches).
2. The REST response is the authoritative confirmation — it arrives independently and the client reconciles from it.

No special handling is required.

### Concurrent Mutations (Last-Write-Wins) [#concurrent-mutations-last-write-wins]

If two devices edit the same entity simultaneously, the **last write to reach the database wins**. Both REST calls succeed independently, and both produce WebSocket events. Each client receives the other client's update event and applies it only if its `version` is newer than the local copy, so both clients converge on the last write.

<Mermaid
  chart="`sequenceDiagram
  participant C1 as &#x22;Client A&#x22;
  participant Core as &#x22;wordloop-core&#x22;
  participant C2 as &#x22;Client B&#x22;

  Note over C1,C2: Both edit the same note simultaneously

  C1->>Core: PATCH /notes/:id (content: &#x22;Version A&#x22;)
  C2->>Core: PATCH /notes/:id (content: &#x22;Version B&#x22;)

  Core-->>C1: 200 OK (content: &#x22;Version A&#x22;)
  Core-->>C2: 200 OK (content: &#x22;Version B&#x22;)

  Note over Core: Last DB write wins: &#x22;Version B&#x22;

  Core-->>C1: WS note.updated (content: &#x22;Version B&#x22;)
  C1->>C1: Apply Version B (overwrites local Version A)

  Core-->>C2: WS note.updated (echo suppressed)
`"
/>

<Callout type="warn" title="No Conflict Resolution">
  Wordloop uses last-write-wins, not conflict resolution. This is appropriate for the current entity types (notes, tasks, meeting metadata) where conflicts are rare and the cost of a lost edit is low. If collaborative editing (e.g., simultaneous text editing within a note) is introduced, this section must be revisited with CRDTs or Operational Transform.
</Callout>

### Delete Race Condition [#delete-race-condition]

If Client A deletes an entity while Client B is editing it:

1. Client A's `DELETE` succeeds. Core publishes `note.deleted` over WebSocket.
2. Client B receives `note.deleted` and removes the entity from its UI — even if Client B has unsaved optimistic changes.
3. If Client B's `PATCH` arrives at Core **after** the delete, Core returns `404 Not Found`. Client B rolls back its optimistic update and surfaces the error.

The delete always wins. The client must handle the case where a `deleted` event arrives for an entity the user is currently editing by closing the editor and surfacing a notification.

### Stale Event Ordering [#stale-event-ordering]

Under network jitter or high load, WebSocket events for the same entity can arrive out of order. Each entity carries a `version` field (monotonically incrementing integer) and an `updatedAt` timestamp:

```
Current local state:    { id: "note_01J...", version: 3 }
Incoming WS event:      { id: "note_01J...", version: 2 }
→ Event version < local version. Discard as stale.
```

The client must **never apply an event whose version is less than or equal to the local version** for the same entity.
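
The rule amounts to a single guard before any event is applied. A minimal sketch, with `applyIfNewer` as an illustrative name:

```typescript
type Versioned = { id: string; version: number };

// Returns the entity that should be kept in local state.
function applyIfNewer<T extends Versioned>(local: T | undefined, incoming: T): T {
  if (local !== undefined && incoming.version <= local.version) {
    return local; // stale or duplicate event: keep what we have
  }
  return incoming; // first sighting or strictly newer version
}
```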

### Reconnection and Missed Events [#reconnection-and-missed-events]

When the WebSocket connection drops (network change, server restart, mobile backgrounding), the client misses every event published during the disconnection window and has to recover them on reconnect:

<Mermaid
  chart="`sequenceDiagram
  participant Client
  participant WS as &#x22;WebSocket Hub&#x22;
  participant Core as &#x22;wordloop-core&#x22;

  Client->>WS: Connected (lastEventId: evt_050)
  WS-->>Client: Events flowing...

  Note over Client,WS: Connection drops

  Note over WS: Events evt_051, evt_052, evt_053 published
  Note over WS: Client is disconnected — events lost

  Client->>WS: Reconnect (lastEventId: evt_050)
  WS-->>Client: Replay evt_051, evt_052, evt_053
  WS-->>Client: Resume live stream
`"
/>

The client tracks the `id` of the last received event. On reconnection, it sends this as `lastEventId` in the handshake. Core replays all events after that ID from a short-lived event buffer before resuming the live stream.

**Reconnection strategy:** The client uses **exponential backoff with jitter** to avoid thundering-herd reconnection storms when the server restarts:

| Attempt | Base Delay | With Jitter (±30%) |
| ------- | ---------- | ------------------ |
| 1       | 1s         | 0.7s – 1.3s        |
| 2       | 2s         | 1.4s – 2.6s        |
| 3       | 4s         | 2.8s – 5.2s        |
| 4       | 8s         | 5.6s – 10.4s       |
| 5+      | 16s (cap)  | 11.2s – 20.8s      |
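
The schedule in the table follows from doubling a 1s base delay, capping it at 16s, and scaling by a random factor between 0.7 and 1.3. A sketch of that calculation, where the function name and the injectable `random` parameter are assumptions made for testability:

```typescript
const BASE_DELAY_MS = 1000;
const MAX_BASE_DELAY_MS = 16000;
const JITTER = 0.3; // ±30%

// `attempt` is 1-based, matching the table.
function reconnectDelayMs(attempt: number, random: () => number = Math.random): number {
  const base = Math.min(BASE_DELAY_MS * 2 ** (attempt - 1), MAX_BASE_DELAY_MS);
  // random() is in [0, 1), so the factor is uniform in [0.7, 1.3).
  const factor = 1 - JITTER + random() * 2 * JITTER;
  return base * factor;
}
```

The cap applies to the base delay, so the jittered value for attempts 5 and later can exceed 16s, as the last table row shows.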

<Callout type="warn" title="Replay Window">
  The event buffer has a finite retention window. If the client has been disconnected longer than the buffer window, a replay is not possible. In this case, the client must perform a full state re-fetch via REST (the same hydration flow as initial page load) and then resume WebSocket subscription.
</Callout>
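
The replay and its fallback can be illustrated with a bounded buffer. This sketch bounds the buffer by event count purely for simplicity; the actual retention policy (count or time) is not specified here, and `ReplayBuffer` is a hypothetical name.

```typescript
type BufferedEvent = { id: string };

class ReplayBuffer<E extends BufferedEvent> {
  private events: E[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(event: E): void {
    this.events.push(event);
    if (this.events.length > this.capacity) {
      this.events.shift(); // evict the oldest event
    }
  }

  // Returns the events published after `lastEventId`, or null when that ID
  // is no longer in the buffer and the client must re-hydrate over REST.
  replayAfter(lastEventId: string): E[] | null {
    const index = this.events.findIndex((e) => e.id === lastEventId);
    if (index === -1) {
      return null;
    }
    return this.events.slice(index + 1);
  }
}
```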

### Idempotency on REST Retry [#idempotency-on-rest-retry]

If a REST mutation times out and the client retries, the server may process the same mutation twice — producing two WebSocket events for a single user action.

For **create** operations, the client should generate and send an `Idempotency-Key` header. Core checks this key against a short-lived cache and returns the cached response if the key has been seen, preventing duplicate creation and duplicate WebSocket events.

For **update** and **delete** operations, natural idempotency applies — updating to the same values or deleting an already-deleted entity produces the same result.

```http
POST /api/v1/notes HTTP/1.1
Idempotency-Key: idem_7f3a9c...
X-Client-Id: abc-123
```
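
On the server side, the key check could look roughly like the sketch below. It is written in TypeScript for consistency with the other examples on this page, is synchronous for brevity, and ignores concurrent in-flight requests with the same key; `IdempotencyCache` is a hypothetical name.

```typescript
type CachedResponse<R> = { response: R; expiresAt: number };

class IdempotencyCache<R> {
  private entries = new Map<string, CachedResponse<R>>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Runs `handler` at most once per key within the TTL. A repeated key
  // returns the cached response, so no second row or event is produced.
  execute(key: string, nowMs: number, handler: () => R): R {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > nowMs) {
      return hit.response;
    }
    const response = handler();
    this.entries.set(key, { response, expiresAt: nowMs + this.ttlMs });
    return response;
  }
}
```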

### Partial Server Failure [#partial-server-failure]

If the database write succeeds but the WebSocket broadcast fails (hub crash, network partition between Core and hub):

* **Originating client**: Receives the REST `201 Created` response and knows the mutation succeeded. Its optimistic update is confirmed.
* **Other clients**: Miss the WebSocket event and do not update their UI.

This is an eventually-consistent failure. Other clients will receive corrected state on their next REST fetch (page navigation, tab focus) or when the WebSocket reconnects and replays missed events. This is acceptable because the originating client — the device where the user performed the action — always sees the confirmed state.

### Rapid Mutations on the Same Entity [#rapid-mutations-on-the-same-entity]

If a user edits the same entity in rapid succession (typing a title, adjusting a slider), firing a REST call for every keystroke wastes bandwidth and creates ordering hazards where a slow early response overwrites a fast later one.

**Strategy: Debounce + Coalesce**

1. **Debounce the REST call.** Wait until the user pauses interaction (300–500ms of inactivity) before sending the mutation. The optimistic update still applies immediately on every keystroke — only the network request is debounced.
2. **Coalesce intermediate states.** Only the final state is sent to the server, not every intermediate value. If the user types "Hel", "Hell", "Hello" — the server receives one `PATCH` with `"Hello"`.
3. **Cancel stale in-flight requests.** If a new mutation fires while a previous one is still in-flight for the same entity, abort the previous request using `AbortController` to prevent a stale response from overwriting the newer state.

```
User types: H → He → Hel → Hello → [pauses 300ms]
Optimistic UI:  H → He → Hel → Hello (each applied immediately)
REST calls:    [none] → [none] → [none] → PATCH { content: "Hello" }
```
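
Steps 1 and 2 can be sketched as a small coalescing debouncer: each call replaces the pending value and restarts the timer, so only the final value is sent. `createCoalescer` and the injectable `Scheduler` are illustrative assumptions, and cancelling in-flight requests (step 3) is left out.

```typescript
type Scheduler = {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
};

const defaultScheduler: Scheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a function to call on every keystroke. `send` fires once, with the
// most recent value, after `delayMs` of inactivity.
function createCoalescer<T>(
  send: (latest: T) => void,
  delayMs: number,
  scheduler: Scheduler = defaultScheduler,
): (value: T) => void {
  let handle: unknown = undefined;
  return (value: T): void => {
    if (handle !== undefined) {
      scheduler.clear(handle); // drop the previously scheduled send
    }
    handle = scheduler.set(() => {
      handle = undefined;
      send(value);
    }, delayMs);
  };
}
```

The optimistic UI update stays outside this helper: it runs on every keystroke, and only the network call goes through the coalescer.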

### Tab Focus Revalidation [#tab-focus-revalidation]

When a browser tab regains focus after being backgrounded, the WebSocket may have silently disconnected without triggering an error event (common on mobile browsers and laptop lid-close). The client should treat tab-focus as a trigger to:

1. **Check WebSocket health.** If the connection is dead, initiate reconnection with `lastEventId` replay.
2. **Revalidate stale queries.** SWR's `revalidateOnFocus` (or equivalent) re-fetches the current view's data via REST to catch any mutations that occurred while the tab was inactive.

This ensures the client is never silently stale after returning from background.

### WebSocket Authentication Lifecycle [#websocket-authentication-lifecycle]

The WebSocket connection authenticates with a JWT during the initial handshake. Since JWTs have a finite lifetime, the connection must handle token expiry and session revocation:

**Token Refresh (Proactive):**

1. The client monitors its JWT expiration. A few minutes before expiry, it refreshes the token via the standard Clerk token refresh.
2. The client sends an `auth.refresh` message over the *existing* WebSocket with the new token.
3. Core validates the new token and associates it with the connection. No reconnection is needed.

**Session Revocation (Server-Initiated):**

1. When a user logs out from any device, or an admin revokes access, Core sends a `session.revoked` event to **all** WebSocket connections for that user.
2. Each client receives the event, closes the WebSocket, clears local state, and redirects to the login screen.
3. Core terminates the server-side connection after sending the event.

**Token Expired (Reactive):**

1. If the token expires without a proactive refresh (client was backgrounded), Core sends a WebSocket close frame with code `4401` (custom "Unauthorized" code).
2. The client refreshes its token and reconnects with the new JWT.

### Cache Reconciliation on Settled [#cache-reconciliation-on-settled]

After every mutation — whether it succeeds or fails — the client should revalidate the affected SWR cache key to ensure the local cache matches the server's authoritative state. This is the `onSettled` pattern:

1. **On success:** The REST response already contains the server-authoritative entity. The client updates the SWR cache with this response. A background revalidation is triggered to catch any concurrent mutations from other devices that may have occurred during the request.
2. **On error:** The rollback restores the snapshot, and a revalidation fetches the current server state to ensure the cache is clean.

This guarantees that even if echo suppression, version comparison, or reconnection logic has a subtle bug, the cache self-heals within one mutation cycle.
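
As a rough illustration of the `onSettled` pattern, the sketch below models the cache as a plain object rather than using SWR directly. `EntityCache`, `MutationOutcome`, and `settle` are hypothetical names; in the app, the returned keys would be passed to SWR's revalidation.

```typescript
// Hypothetical, SWR-independent model of the cache: key -> entity.
type EntityCache<T> = Record<string, T>;

type MutationOutcome<T> = { ok: true; entity: T } | { ok: false };

// Apply the settled result of a mutation to the cache and return the keys
// that should be revalidated in the background.
// `snapshot` is the value captured before the optimistic write
// (undefined when the key did not exist yet).
function settle<T>(
  cache: EntityCache<T>,
  key: string,
  snapshot: T | undefined,
  outcome: MutationOutcome<T>
): string[] {
  if (outcome.ok === true) {
    // Success: the REST response is the server-authoritative entity.
    cache[key] = outcome.entity;
  } else if (snapshot === undefined) {
    // Error on a create: remove the optimistic entry.
    delete cache[key];
  } else {
    // Error on an update: roll back to the snapshot.
    cache[key] = snapshot;
  }
  // Success or failure, revalidate so the cache converges on server state.
  return [key];
}
```

Returning the key in both branches is the point of the pattern: revalidation is unconditional, so a bug elsewhere cannot leave the cache stale for longer than one mutation.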

***

## What This Pattern Does NOT Cover [#what-this-pattern-does-not-cover]

| Concern                      | Handled By                                                                                                                                                                                |
| ---------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Audio streaming**          | Dedicated binary WebSocket pipeline — see [Real-Time WebSocket Streaming](/docs/learn/architecture/system-workflows#2-real-time-websocket-streaming)                                      |
| **ML-generated events**      | Transcript segments and insights originate from Pub/Sub consumers, not REST mutations. These always flow through the WebSocket without echo suppression because no client initiated them. |
| **Authentication**           | JWT validation on REST and WebSocket handshake — see [Authentication](/docs/learn/architecture/auth)                                                                                      |
| **Pub/Sub worker pipelines** | Asynchronous inter-service communication — see [Unified Asynchronous Meeting Finalization](/docs/learn/architecture/system-workflows#1-unified-asynchronous-meeting-finalization)         |


# Infrastructure & Hosting (/docs/learn/architecture/infrastructure)


# Infrastructure & Hosting [#infrastructure--hosting]

Wordloop deploys entirely to managed Google Cloud serverless infrastructure in production. For information about local development and emulation, see the [Local Infrastructure](local-infrastructure.md) page.

## Production Hosting [#production-hosting]

| Service           | GCP Resource     | Description                                                                            |
| ----------------- | ---------------- | -------------------------------------------------------------------------------------- |
| **wordloop-docs** | Firebase Hosting | Next.js/Fumadocs static site deployment.                                               |
| **wordloop-app**  | Cloud Run        | Next.js server utilizing SSR and Route Handlers.                                       |
| **wordloop-core** | Cloud Run        | Go REST API.                                                                           |
| **wordloop-ml**   | Cloud Run (x2)   | Deployed as two separate services: an HTTP web server and a Pub/Sub background worker. |
| **Database**      | Cloud SQL        | Managed Postgres 15 database instance.                                                 |
| **Messaging**     | Cloud Pub/Sub    | Managed topics and subscriptions.                                                      |

### API Routing (Production) [#api-routing-production]

To ensure the frontend is environment-agnostic, the Next.js `wordloop-app` implements a **Server-Side API Proxy**.

* All frontend fetches are directed to `/api/...`.
* A Next.js Route Handler proxies these requests to the underlying `wordloop-core` URL (defined via the `CORE_API_URL` environment variable at runtime).
* This prevents hardcoding backend URLs during the Next.js build step.
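
The URL rewriting at the heart of the proxy can be sketched as follows. This is an assumption-laden sketch: `buildUpstreamUrl` is a hypothetical helper, and stripping the `/api` prefix before forwarding is a guess about how Core mounts its routes.

```typescript
// Build the upstream wordloop-core URL for a proxied `/api/...` request.
// `coreBaseUrl` stands in for the runtime value of CORE_API_URL.
// Assumption: Core serves its routes without the `/api` prefix.
function buildUpstreamUrl(coreBaseUrl: string, requestPath: string, search: string): string {
  const prefix = "/api";
  if (requestPath.indexOf(prefix + "/") !== 0) {
    throw new Error("not an API path: " + requestPath);
  }
  // Tolerate a trailing slash on the configured base URL.
  const base = coreBaseUrl.replace(/\/+$/, "");
  return base + requestPath.slice(prefix.length) + search;
}
```

A Route Handler would call this with the value of `CORE_API_URL` read at request time and forward the method, headers, and body with `fetch`, which is what keeps the backend URL out of the build output.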

## Environment Configuration [#environment-configuration]

Configuration relies exclusively on environment variables injected at runtime. There are NO configuration files deployed with the containers. See individual service handbooks for specifics.


# Local Infrastructure (/docs/learn/architecture/local-infrastructure)


# Local Infrastructure & Emulation [#local-infrastructure--emulation]

Wordloop utilizes a **hybrid local-first development model** orchestrated via our custom `./dev` CLI.

Instead of running the entire stack in heavy, monolithic Docker containers, we segment the environment:

* **Infrastructure & Observability (Docker):** Stateful services (Postgres, Pub/Sub emulator, Storage emulator) and telemetry tools (Aspire Dashboard) run in Docker.
* **Application Services (Native):** Code bases (`wordloop-core`, `wordloop-ml`, `wordloop-app`, `wordloop-docs`) run natively on your host machine. We use file monitoring tools (`air` for Go, `uvicorn` for Python, and Next.js dev server) to enable **instant hot-reloading**, bypassing the need to rebuild Docker images after every code change.

## Local Port Architecture [#local-port-architecture]

To prevent port collisions, all services follow a well-structured port layout:

### Application Services (Native) [#application-services-native]

| Service           | Internal Target     | Port   | Tooling                |
| ----------------- | ------------------- | ------ | ---------------------- |
| **wordloop-app**  | Next.js Frontend    | `4001` | `next dev`             |
| **wordloop-docs** | Fumadocs Site       | `4000` | `next dev`             |
| **wordloop-core** | Go REST API         | `4002` | `air` (hot-reload)     |
| **wordloop-ml**   | Python API & Worker | `4003` | `uvicorn` (hot-reload) |

### Infrastructure Spaces (Docker) [#infrastructure-spaces-docker]

| Service              | Image / Role                  | Port    |
| -------------------- | ----------------------------- | ------- |
| **Aspire Dashboard** | Local Observability UI        | `18888` |
| **Postgres**         | `postgres:15`                 | `5432`  |
| **Pub/Sub**          | `cloud-sdk:emulators`         | `8085`  |
| **Storage (GCS)**    | `oittaa/gcp-storage-emulator` | `8086`  |

* **Statefulness:** Postgres data is persisted in a local Docker volume (`db_data`). Emulators spin up ephemerally. The Core service programmatically provisions required Pub/Sub topics and buckets on boot.
* **Bootstrapping:** Use `./dev start all` to bring up the Docker infra and native host services concurrently. Run `./dev help` for more granularity.

## Environment Configuration [#environment-configuration]

Configuration relies exclusively on environment variables injected at runtime. There are NO configuration files deployed with the containers. See individual service handbooks for specifics.


# Observability (/docs/learn/architecture/observability)


# Observability [#observability]

Instead of emitting fragmented logs, metrics, and traces, we generate high-cardinality, wide events (Spans) using **OpenTelemetry (OTel)**. These spans serve as the single source of truth for the health, performance, and behavior of the entire platform.

## Tracing Architecture [#tracing-architecture]

We utilize W3C Trace Context headers to propagate traces across every service boundary, ensuring that identity and context are never severed from the symptom.

<Mermaid
  chart="`graph LR
  APP[&#x22;wordloop-app&#x22;] -->|&#x22;traceparent&#x22;| CORE[&#x22;wordloop-core&#x22;]
  CORE -->|&#x22;Postgres span&#x22;| DB[(Postgres)]
  CORE -->|&#x22;traceparent&#x22;| PS[&#x22;Pub/Sub Message&#x22;]
  ML[&#x22;wordloop-ml&#x22;] -->|&#x22;extracts trace&#x22;| PS
  ML -->|&#x22;AssemblyAI span&#x22;| AAI[&#x22;AssemblyAI API&#x22;]
  ML -->|&#x22;traceparent&#x22;| CORE
`"
/>

* **App (Next.js):** Generates the root span for user interactions, authenticates via Clerk, and injects `clerk_user_id` into OTel Baggage as `enduser.id`.
* **Core (Go):** Uses the OpenTelemetry Go SDK (`go.opentelemetry.io/otel`) to trace HTTP handlers, Postgres queries (via pgx), and Pub/Sub publishing. It automatically reads W3C Baggage from incoming requests and propagates it via Pub/Sub attributes.
* **ML (Python):** Uses `opentelemetry-python` to extract trace context and identity Baggage from incoming Pub/Sub messages, trace ML pipelines, and propagate context when calling Core.
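
To make the propagated value concrete, here is a minimal sketch of parsing and re-emitting a W3C `traceparent` header. In practice the OTel SDK propagators do this for you; the helper names are hypothetical, and the sketch accepts only version `00` and carries only the sampled flag forward.

```typescript
// W3C Trace Context `traceparent`: version-traceid-parentid-flags, lowercase hex.
interface TraceParent {
  traceId: string; // 32 hex characters
  parentId: string; // 16 hex characters
  sampled: boolean; // lowest bit of the flags byte
}

function parseTraceParent(header: string): TraceParent | null {
  const m = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (m === null) return null;
  const traceId = m[1];
  const parentId = m[2];
  const flags = parseInt(m[3], 16);
  // All-zero trace or parent IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId: traceId, parentId: parentId, sampled: (flags & 1) === 1 };
}

// Header for an outgoing call or message: same trace ID, new span ID as parent.
function childTraceParent(parent: TraceParent, newSpanId: string): string {
  return "00-" + parent.traceId + "-" + newSpanId + "-" + (parent.sampled ? "01" : "00");
}
```

The trace ID stays constant across App, Core, Pub/Sub, and ML, while the parent ID changes at every hop; that is what lets the backend stitch spans from different services into one trace.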

### Span-Derived Metrics [#span-derived-metrics]

We do **not** manually instrument and roll up traditional RED (Rate, Errors, Duration) metrics at runtime. Emitting isolated metrics destroys the context necessary for debugging.

Instead, our system relies on dynamic aggregations of our wide spans. Because every span contains the exact duration, status code, and rich metadata (tenant IDs, roles), our observability backend continuously calculates and visualizes RED metrics derived directly from the trace stream. If an aggregate error rate spikes, engineers can simply click the spike to see the exact traces that generated it.

## Logging [#logging]

To ensure structural consistency, all logs are written as structured JSON and natively integrate the OpenTelemetry context.

* **Go Logging:** Implemented via `slog` with an OpenTelemetry handler.
* **Python Logging:** Implemented via `structlog`, with the active OTel context bound to every log event.

Every log emitted within the scope of a request automatically inherits the `trace_id` and `span_id`, allowing developers to find any application log by looking at its parent trace.

## Telemetry Destinations & Sampling [#telemetry-destinations--sampling]

Our services act purely as OTLP (OpenTelemetry Protocol) emitters. They never communicate directly with the final observability storage backend. Data routing and sampling are centrally managed.

### Local Development (.NET Aspire) [#local-development-net-aspire]

Locally, all services export OTLP data to the **.NET Aspire Dashboard**.

1. Run `./dev dash obs` (or start it automatically via `./dev start infra`).
2. Access the UI at [http://localhost:18888](http://localhost:18888).
3. You can view Traces, Metrics, and Structured Logs across all services in real time. Since `enduser.id` Baggage is propagated, you can search for a user's exact ID to trace their entire session timeline end-to-end.

### Production Pipeline & Tail-Based Sampling [#production-pipeline--tail-based-sampling]

In production, SDKs do not push directly to Google Cloud. We deploy instances of the **OpenTelemetry Collector Gateway** to act as an intermediary buffer.

<Mermaid
  chart="`graph LR
  SERVICES[&#x22;Wordloop Services (Go, Python)&#x22;] -->|&#x22;OTLP gRPC (100% traces)&#x22;| COLLECTOR[&#x22;OTel Collector Gateway&#x22;]
  COLLECTOR -->|&#x22;Tail-Based Sampling Processor&#x22;| GCP[&#x22;Google Cloud Trace / Logging&#x22;]
`"
/>

Because we employ **Tail-Based Sampling** to control ingest and storage costs, the Collector buffers every span of a distributed trace until the trace is complete, then applies our sampling rules:

* **100% Sampling for Errors & High Latency:** If any span anywhere in the trace breaches our latency threshold or contains an error, the entire trace is preserved and exported to Google Cloud.
* **5% Sampling for Happy Paths:** If the request succeeded without anomalies, the Collector keeps 5% of such traces and drops the other 95%, saving ingest and storage costs without sacrificing visibility into system failures.
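
The sampling rules can be expressed as a single decision over a completed trace. The sketch below only illustrates the logic: `keepTrace` and `SpanSummary` are hypothetical, and the latency threshold is a parameter because this page does not state its value.

```typescript
interface SpanSummary {
  durationMs: number;
  isError: boolean;
}

// Decide whether to export a completed trace.
// `latencyThresholdMs` is hypothetical; the real value is not given here.
// `roll` is a number in [0, 1), injected so the decision is deterministic in tests.
function keepTrace(spans: SpanSummary[], latencyThresholdMs: number, roll: number): boolean {
  const anomalous = spans.some((s) => s.isError || s.durationMs > latencyThresholdMs);
  if (anomalous) {
    return true; // errors and slow traces: keep 100%
  }
  return roll < 0.05; // happy path: keep 5%
}
```

In production this decision lives in the Collector's tail-sampling configuration rather than in application code. The key property is that it is evaluated once per trace, after all spans have arrived, so a single failing span preserves the whole trace.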


# System Architecture (/docs/learn/architecture/overview)


# System Architecture Overview [#system-architecture-overview]

Wordloop is a localized, intelligence-first platform structured so that each service owns an isolated domain boundary, communicating through strictly typed, declarative contracts.

## High-Level Topology [#high-level-topology]

<Mermaid
  chart="`graph TB
  subgraph Client
      APP[wordloop-app<br/>Next.js :4001]
  end

  subgraph Auth Providers
      CLERK[Clerk Auth]
  end

  subgraph Backend
      CORE[wordloop-core<br/>Go :4002]
      ML[wordloop-ml<br/>FastAPI :4003]
  end

  subgraph Infrastructure
      PG[(Postgres)]
      PS[Pub/Sub]
      GCS[Storage]
  end

  APP -->|Authenticate| CLERK
  CLERK -->|JWT| APP
  APP -->|&#x22;REST — CUD Mutations (JWT)&#x22;| CORE
  APP <-->|&#x22;WebSocket — Streaming Reads&#x22;| CORE
  CLERK -->|Webhooks| CORE
  
  CORE --> PG
  CORE -->|Publish| PS
  CORE --> GCS
  
  ML -->|Subscribe| PS
  ML --> GCS
  ML -->|&#x22;REST (Service Token)&#x22;| CORE`"
/>

## Service Boundaries [#service-boundaries]

The platform is decoupled into three primary execution domains:

### `wordloop-core` (Go) [#wordloop-core-go]

The absolute system of record. Responsible for transactional orchestration, state management, Clerk webhook syncing, and exposing the primary REST API via [Huma](https://huma.rocks).

* [Core Service Handbook](../services/core/index.md)

### `wordloop-ml` (Python) [#wordloop-ml-python]

The async intelligence engine. Stateless, event-driven, and built on FastAPI. It consumes Pub/Sub events from Core, interfaces with external APIs (AssemblyAI), and uses a symmetric service token to push structured data back to Core.

* [ML Service Handbook](../services/ml/index.md)

### `wordloop-app` (Next.js) [#wordloop-app-nextjs]

The presentation layer built on React Server Components. Authenticates via Clerk and communicates with Core via Orval-generated API clients wrapped in a Next.js server-side proxy route.

* [App Service Handbook](../services/app/index.md)

## Communication Patterns [#communication-patterns]

The client–server data architecture follows the **[Optimistic Mutation with Echo-Suppressed Streaming](data-flow)** pattern: REST for writes, WebSocket for reads, with optimistic UI and source-aware echo suppression for multi-device sync. Contracts act as the sole source of truth. Hand-written API clients are forbidden.

| Pattern                 | Mechanism                                                                                           |
| ----------------------- | --------------------------------------------------------------------------------------------------- |
| **Mutations (CUD)**     | REST via Orval-generated clients. Optimistic UI with rollback. Next.js proxies to circumvent CORS.  |
| **Streaming Reads**     | WebSocket pushes complete entity payloads on every state change. Echo suppressed via `X-Client-Id`. |
| **Worker Dispatch**     | GCP Pub/Sub utilizing strict AsyncAPI schemas for inter-service async work.                         |
| **Internal Writebacks** | Internal REST calls authenticated via strict Service Tokens (ML → Core).                            |
| **Identity Sync**       | Webhooks from Clerk ingested to local Postgres `users` table.                                       |

See the dedicated documentation for [Authentication](auth.md), [Data Flow](data-flow), [Observability](observability.md), and [Hosting](infrastructure.md).


# System Workflows (/docs/learn/architecture/system-workflows)


# System Workflows [#system-workflows]

This document outlines Wordloop's key data pipelines and the order in which components interact in each.

## 1. Unified Asynchronous Meeting Finalization [#1-unified-asynchronous-meeting-finalization]

Wordloop uses a single background processing pipeline that *both* processes batch-uploaded raw audio files and finalizes severed or abandoned live WebSocket meetings.

<Mermaid
  chart="`sequenceDiagram
  participant App as &#x22;wordloop-app&#x22;
  participant Core as &#x22;wordloop-core&#x22;
  participant GCS as &#x22;GCP Storage&#x22;
  participant PS as &#x22;Pub/Sub&#x22;
  participant ML as &#x22;wordloop-ml&#x22;
  
  %% Trigger Path A: Manual File Upload
  App->>Core: POST /meetings (Audio Upload)
  Core->>GCS: Persist full audio.wav
  %% Trigger Path B: Completed Live Meeting
  App->>Core: WS disconnect / Event Stop
  Core->>GCS: finalize io.Pipe() streaming blob
  
  Core->>PS: Publish TranscriptionJobMessage
  
  PS->>ML: Deliver TranscriptionJobMessage
  ML->>Core: GET /transcriptions/:id/segments (Fetch any partials)
  ML->>GCS: Download complete audio.wav
  ML->>ML: Diff: Audio Duration vs Existing Segment Timestamps
  
  alt Missing Audio Detected (Self-Healing)
      ML->>ML: Slice out untranscribed raw tail
      ML->>ML: Transcribe (AssemblyAI)
      ML->>Core: POST /transcriptions/:id/segments (append new)
  end
  
  ML->>ML: generate_insights(all_segments)
  ML->>Core: Store Task & Topic entities
  ML->>Core: PATCH meeting (summary, headline)
`"
/>

<Callout type="warn" title="Resilient Synthesis">
  By deferring all heavy generation work to the asynchronous `TranscriptionJobMessage` pipeline, Wordloop protects stateful live-recording connections from cascading OOM crashes, and the offline job self-heals any stream that was cut short.
</Callout>
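
The "Diff: Audio Duration vs Existing Segment Timestamps" step in the diagram can be sketched as below. The ML service is written in Python; this TypeScript version is only an illustration of the logic, with hypothetical names, millisecond units, and an assumed `toleranceMs` parameter to absorb trailing silence.

```typescript
interface SegmentTiming {
  startMs: number;
  endMs: number;
}

// Return the untranscribed tail [fromMs, toMs) of the audio, or null when
// the existing segments already reach the end of the recording.
function untranscribedTail(
  audioDurationMs: number,
  segments: SegmentTiming[],
  toleranceMs: number
): { fromMs: number; toMs: number } | null {
  // The furthest point any existing segment has covered.
  let coveredMs = 0;
  for (const s of segments) {
    if (s.endMs > coveredMs) coveredMs = s.endMs;
  }
  if (audioDurationMs - coveredMs <= toleranceMs) {
    return null; // nothing meaningful is missing
  }
  return { fromMs: coveredMs, toMs: audioDurationMs };
}
```

With no existing segments the function returns the whole file, which is how one code path can serve both triggers: a batch upload is simply a meeting whose entire audio is still untranscribed.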

## 2. Real-Time WebSocket Streaming [#2-real-time-websocket-streaming]

The synchronous audio pipeline is designed around high availability, zero in-memory buffering, and fan-out of each audio frame to multiple endpoints.

<Mermaid
  chart="`sequenceDiagram
  participant App as &#x22;wordloop-app&#x22;
  participant Core as &#x22;wordloop-core&#x22;
  participant GCS as &#x22;GCP Storage&#x22;
  participant ML as &#x22;wordloop-ml&#x22;
  participant AAI as &#x22;AssemblyAI&#x22;
  
  App->>Core: WebSocket Connect /ws/stream
  Core->>GCS: Open io.Pipe() Stream
  Core->>ML: WebSocket Connect /ws/transcribe
  ML->>AAI: WSS Connect (AssemblyAI Streaming)
  
  loop Every Audio Frame
      App->>Core: Binary Audio Data
      %% Dispersed natively holding no pod memory footprint
      par 
          Core->>GCS: Write byte to Pipe
      and
          Core->>ML: Relay Binary Frame
      end
      ML->>AAI: Relay Framed Data
  end
  
  loop Real-Time Words
      AAI-->>ML: Final Transcript Segment Threshold Hit
      ML->>Core: POST /transcriptions/:id/segments (REST)
      
      Note over Core: TranscriptionService.AddTranscriptSegments()
      par Persist + Broadcast (Optimistic Mutation)
          Core->>Core: Persist Segment to DB
          Core-->>App: WebSocket Event (entity_changed)
      end
      
      par Extract Live Insights
          ML->>ML: LLM Extract (Tasks, Topics, Points)
          ML->>Core: POST /meetings/:id/talking_points (REST)
          ML->>Core: POST /tasks (REST)
          
          Note over Core: Service Layer → DB → WebSocket Hub
          Core-->>App: WebSocket Event (entity_changed)
      end
  end
  
  App->>Core: stop_recording
  Core->>GCS: Close io.Pipe() EOF
  Core->>Core: Dispatch -> 1. Unified Asynchronous Meeting Finalization
`"
/>

## 3. Voice Context Pipelines [#3-voice-context-pipelines]

Workflows for orchestrating speaker identity, embeddings, and context.

<Callout type="info">
  Vector matching operations are computationally intensive. The frontend must expect varying latency when querying nearest neighbors.
</Callout>

<Mermaid
  chart="`sequenceDiagram
  participant User
  participant App as &#x22;wordloop-app&#x22;
  participant ML as &#x22;wordloop-ml&#x22;
  participant Core as &#x22;wordloop-core&#x22;
  
  Note over User, ML: Voice Enrollment
  User->>App: Record/Upload voice samples
  App->>ML: POST /voice/{person_id}/add (multipart audio)
  ML->>ML: Extract SpeechBrain Embeddings
  ML->>Core: Insert Embedding for person_id    
  
  Note over User, ML: Voice Matching
  User->>App: Record new audio segment
  App->>ML: POST /voice/match (multipart audio)
  ML->>ML: Extract SpeechBrain Embeddings
  ML->>Core: Search Nearest Neighbors (Vector Match)
  Core-->>ML: Best Matches (Person IDs) + Distances
  ML-->>App: Top-K Matches
`"
/>
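
For intuition about the "Search Nearest Neighbors" step, here is a brute-force sketch of top-K matching. In Wordloop the search runs in Core against stored embeddings; the linear scan, the choice of cosine distance, and all names below are assumptions made for illustration.

```typescript
interface VoiceEmbedding {
  personId: string;
  vector: number[];
}

interface VoiceMatch {
  personId: string;
  distance: number;
}

// Cosine distance = 1 - cosine similarity; 0 means the vectors point the same way.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("zero vector has no direction");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored embeddings by distance to the query and keep the closest k.
function topK(query: number[], stored: VoiceEmbedding[], k: number): VoiceMatch[] {
  return stored
    .map((e) => ({ personId: e.personId, distance: cosineDistance(query, e.vector) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}
```

A linear scan costs one distance computation per stored embedding, which is why the callout above warns that latency varies with the size of the search space.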

## 4. AI Chat Context Orchestration [#4-ai-chat-context-orchestration]

Retrieving meeting context for intelligent conversational RAG queries.

<Mermaid
  chart="`sequenceDiagram
  participant User
  participant App as &#x22;wordloop-app&#x22;
  participant Core as &#x22;wordloop-core&#x22;
  participant OAI as &#x22;OpenAI&#x22;

  User->>App: Send chat message
  App->>Core: POST /ai/threads/:id/messages
  Core->>Core: Load thread + meeting context
  Core->>OAI: Chat completion (with tools)
  OAI-->>Core: Response (possibly with tool calls)
  Core->>Core: Execute tool calls (search, lookup)
  Core->>OAI: Follow-up with tool results
  OAI-->>Core: Final response
  Core-->>App: Chat message response
  App-->>User: Display response
`"
/>
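
The request, tool call, and follow-up cycle in the diagram amounts to a bounded loop. The sketch below is deliberately synchronous and uses a stubbed model function instead of the OpenAI client; real calls are asynchronous, Core itself is written in Go, and every type and name here (including the `maxRounds` safeguard) is hypothetical.

```typescript
type ChatMessage =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; name: string; content: string };

type ModelReply =
  | { kind: "final"; content: string }
  | { kind: "tool_calls"; calls: { name: string; args: string }[] };

type Model = (history: ChatMessage[]) => ModelReply;
type Tools = Record<string, (args: string) => string>;

// Call the model, execute any requested tools, feed the results back, and
// repeat until a final answer arrives or `maxRounds` is exhausted.
function runChat(model: Model, tools: Tools, history: ChatMessage[], maxRounds: number): string {
  const msgs = history.slice(); // do not mutate the caller's thread
  for (let round = 0; round < maxRounds; round++) {
    const reply = model(msgs);
    if (reply.kind === "final") {
      return reply.content;
    }
    for (const call of reply.calls) {
      const known = Object.prototype.hasOwnProperty.call(tools, call.name);
      const result = known ? tools[call.name](call.args) : "error: unknown tool " + call.name;
      msgs.push({ role: "tool", name: call.name, content: result });
    }
  }
  throw new Error("tool loop did not converge");
}
```

Bounding the loop matters because the model, not the server, decides when to stop asking for tools; without a cap, a confused model could keep Core calling tools indefinitely.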


# Concepts (/docs/learn/concepts)


# Concepts [#concepts]

A shared vocabulary is not a cosmetic concern. When every engineer on the team means the same thing by "segment," "synthesis," or "task," design conversations become faster and bugs become easier to describe. This page is the canonical glossary of the domain; use it when writing code, specs, or tests.

## Core entities [#core-entities]

**Meeting** — the primary unit of work in Wordloop. A Meeting is a bounded session captured in the system, tied to a user, optionally attended by multiple People, and producing a Transcription, a MeetingSynthesis, and Tasks. The `meetings` table and the `/meetings` routes are the center of gravity for the entire platform.

**Person** — a contact record representing an attendee of one or more Meetings. A Person carries identity fields (display name, email, title, company) and an optional voice model used to attribute TranscriptSegments to a speaker. People are distinct from Users — a User is someone with a Wordloop account; a Person is someone who appeared in a meeting, with or without an account.

**Transcription** — the speech-to-text record attached to a Meeting. A Transcription aggregates TranscriptSegments as they are produced in near-real-time by the ML service and reaches a `completed` status when the meeting closes.

**TranscriptSegment** — the atomic unit of the Transcription. Each segment carries a speaker label, the attributed Person (if matched), text, start and end timestamps, a confidence score, and an `is_final` flag. Most ML processing — embeddings, topic extraction, synthesis — operates over segments.

**MeetingSynthesis** — the AI-generated summary attached to a Meeting. Contains a headline, a prose summary, key points, a list of Topics, and nested TalkingPoints. Produced by the ML service after the Transcription finalises; can be regenerated on demand.

**Topic** — a thematic cluster extracted from a Meeting's segments. Topics carry a name, a summary, and the set of TranscriptSegments that contributed to them. A Meeting has many Topics; a Topic belongs to one Meeting.

**TalkingPoint** — a specific point or claim within a Topic. TalkingPoints are the most granular unit of the MeetingSynthesis, surfaced in the recap UI as bullets under each Topic.

**Task** — an action item extracted from a Meeting. Tasks are assignable, trackable, and hierarchical (via `parent_task_id`). They live beyond the Meeting itself and are the primary output a user acts on after review. Status values: `pending`, `in_progress`, `completed`, `canceled`.
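
As a small illustration of the Task definition, the sketch below models the four status values and walks the `parent_task_id` hierarchy. The `TaskRecord` shape (including `title`) and the `subtree` helper are hypothetical, and the traversal assumes the hierarchy contains no cycles.

```typescript
type TaskStatus = "pending" | "in_progress" | "completed" | "canceled";

interface TaskRecord {
  id: string;
  title: string; // assumed field
  status: TaskStatus;
  parent_task_id: string | null;
}

// Collect a task and all of its descendants, breadth-first.
// Assumes parent links form a tree (no cycles).
function subtree(tasks: TaskRecord[], rootId: string): TaskRecord[] {
  const out: TaskRecord[] = [];
  const queue: string[] = [rootId];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    for (const t of tasks) {
      if (t.id === id) out.push(t);
      if (t.parent_task_id === id) queue.push(t.id);
    }
  }
  return out;
}
```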

## Supporting entities [#supporting-entities]

**User** — a Wordloop account holder, identified via Clerk. A User has an associated Person record (the voice model and contact info for their own participation in meetings). JIT-provisioned on first sign-in.

**Note** — a free-form annotation attached to any entity (`meeting`, `person`, `task`, etc.) via a polymorphic `subject_type` / `subject_id` pair.

**Tag** — a label a user can apply to Meetings, People, or Tasks for organisation.

## Cross-cutting concepts [#cross-cutting-concepts]

**Voice model** — the speaker-identification vector attached to a Person. The ML service matches incoming audio against stored voice vectors to attribute TranscriptSegments to a specific Person rather than an anonymous `SpeakerLabel`. Voice models are built incrementally from verified segments.

**JIT provisioning** — "just-in-time" user creation. When a user signs in via Clerk for the first time, the Core API reads their Clerk profile and creates both the local User record and the corresponding Person record on demand. No webhooks, no seeding.

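**Echo suppression** — the mechanism by which the client that issued a REST mutation does not re-apply the resulting WebSocket event as if it were a remote change. Suppression is keyed on the `X-Client-Id` of the originating request, so other devices still receive the update. A subtle but load-bearing piece of the real-time pipeline; see [Real-Time principles](/docs/principles/system-design/real-time) for the design model.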
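
JIT provisioning, as defined above, is a get-or-create operation keyed on the Clerk identity. The sketch below uses in-memory arrays in place of the database; Core is written in Go, and all type names, field names, and ID formats here are hypothetical.

```typescript
interface ClerkProfile {
  clerkUserId: string;
  email: string;
  displayName: string;
}

interface PersonRecord {
  id: string;
  displayName: string;
  email: string;
}

interface UserRecord {
  id: string;
  clerkUserId: string;
  personId: string;
}

// In-memory stand-in for the users and people tables.
interface IdentityStore {
  users: UserRecord[];
  people: PersonRecord[];
}

// Return the local User for a Clerk identity, creating the User and its
// Person on first sign-in. Later calls with the same identity create nothing.
function provision(store: IdentityStore, profile: ClerkProfile): UserRecord {
  const existing = store.users.filter((u) => u.clerkUserId === profile.clerkUserId)[0];
  if (existing) return existing;
  const person: PersonRecord = {
    id: "person_" + (store.people.length + 1),
    displayName: profile.displayName,
    email: profile.email,
  };
  const user: UserRecord = {
    id: "user_" + (store.users.length + 1),
    clerkUserId: profile.clerkUserId,
    personId: person.id,
  };
  store.people.push(person);
  store.users.push(user);
  return user;
}
```

Against a real database, the check and the two inserts would need to run in one transaction, backed by a uniqueness constraint on the Clerk ID, so that two concurrent first requests cannot create duplicate records.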

## Further reading [#further-reading]

* [Architecture Overview](/docs/learn/architecture/overview) — how these entities are distributed across services.
* [Data Flow](/docs/learn/architecture/data-flow) — the lifecycle of a segment, from microphone to synthesis.
* [Reference / Glossary](/docs/reference/glossary) — the complete, link-resolvable vocabulary.


# Platform Services (/docs/learn/services)


# Platform Services [#platform-services]

Wordloop is composed of four services, each with a distinct responsibility, language, and runtime. This section contains one handbook per service — how it is structured, what it owns, and how to work on it.

<Cards>
  <Card title="wordloop-core" href="/docs/learn/services/core" description="Go HTTP and WebSocket API. Source of truth for Loops, turns, participants, and real-time session state." />

  <Card title="wordloop-ml" href="/docs/learn/services/ml" description="Python FastAPI runtime for transcription, recap generation, and embedding. Hexagonal architecture." />

  <Card title="wordloop-app" href="/docs/learn/services/app" description="Next.js 16 frontend. Server-rendered shell with streaming React Server Components and SWR for live state." />
</Cards>

The documentation site itself (`wordloop-docs`) is a fourth deployable but is treated as a piece of platform tooling rather than an application surface; it is documented via the [Reference](/docs/reference) and [Guides](/docs/guides) sections.


# Runbooks (/docs/operations/runbooks)


# Runbooks [#runbooks]

A runbook is a script. It is written so a tired, stressed engineer can follow it at 3am and restore service without having to reason from first principles. Each runbook in this section targets a specific, recognisable failure symptom and walks through detection, diagnosis, mitigation, and recovery.

## Runbook authoring [#runbook-authoring]

New runbooks are welcome — every incident we resolve should teach the team one. The template:

```markdown
# Runbook: <symptom>

**Owner:** <team>
**Last tested:** YYYY-MM-DD
**Pager rule:** <alert name>

## Goal
Restore <X> when <Y>.

## Detection
How to confirm this is the failure you are hitting.

## Diagnosis
Fast checks to localise the fault.

## Mitigation
Immediate actions to restore user-facing health.

## Recovery
Steps to return to a fully healthy state.

## Rollback
How to undo each state-changing step.

## Escalation
When and whom to escalate to.

## Postmortem
Link to the incident doc once one exists.
```

## Available runbooks [#available-runbooks]

*The catalogue is populated as real incidents drive new runbooks. Writing a runbook "just in case" is usually wasted effort; writing one in the follow-up from an actual incident captures the specific, sharp-edged lessons a generic version would miss.*

See [On-Call](/docs/operations/on-call) for rotation logistics and [Troubleshooting](/docs/operations/troubleshooting) for exploratory diagnostic trees.


# Agent-Native Systems (/docs/principles/ai-native/agent-native-systems)


# Agent-Native Systems [#agent-native-systems]

## TL;DR [#tldr]

AI agents read our APIs, our events, and our documentation programmatically. Building agent-native systems means designing every interface — contract, spec, doc page — so that an agent can consume it without a human translator in the loop. MCP for structured tool surfaces, `llms.txt` for discoverable documentation, stable error codes, rich OpenAPI examples — the pieces compose into a system agents can work inside.

## Why this matters [#why-this-matters]

The organisation that takes agent-readiness seriously in 2026 gets a multiplier on every engineer's output. Agents write code faster, answer questions faster, and onboard faster when the systems they are working against are designed for them. The organisation that treats agent-readiness as an afterthought pays the cost in a constant low-grade friction: agents that need babysitting, outputs that need correction, onboarding that requires a human bootstrapping step for every task. The investment is modest; the return compounds.

## Our principles [#our-principles]

### 1. Every interface has a machine-consumable specification [#1-every-interface-has-a-machine-consumable-specification]

HTTP endpoints have OpenAPI; events have AsyncAPI; documentation has `llms.txt` and `.md` exports; the tools an agent should use have MCP schemas. An interface without a machine-consumable spec is off-limits to agents by default.

### 2. Specifications include descriptions, examples, and constraints [#2-specifications-include-descriptions-examples-and-constraints]

A spec that says a field is `string` without saying what the string represents is a spec an agent cannot use correctly. We write descriptions, give examples, enumerate finite domains, and state constraints explicitly. The standard is: a competent agent should be able to use the interface without reading the implementation.

### 3. MCP is our standard tool surface [#3-mcp-is-our-standard-tool-surface]

When we want agents to interact with Wordloop beyond reading, we expose the capability through a Model Context Protocol server. Tools are typed, documented, and error-reporting; resources are typed and fetchable. A bespoke prompt-engineering integration is a deprecated pattern — MCP is the interop.

### 4. `llms.txt` and `.md` exports are shipped alongside docs [#4-llmstxt-and-md-exports-are-shipped-alongside-docs]

Every docs site ships `llms.txt` (the index) and `llms-full.txt` (the consolidated corpus), plus a `.md` export for every page. Agents navigate the docs the same way a human would, but through a plain-text channel that does not require HTML parsing.

### 5. Error responses are structured, stable, and actionable [#5-error-responses-are-structured-stable-and-actionable]

Every error carries a stable code, a human message, and machine-readable details. The code is catalogued in [Reference / Errors](/docs/reference/errors) and never renumbered. Agents branch on codes; they do not parse prose. This is the single highest-leverage API hygiene choice for agent-readiness.
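
A caller that branches on codes might look like the sketch below. The envelope field names and the specific codes are invented for illustration; the real catalogue lives in Reference / Errors.

```typescript
// Assumed envelope: this section names the three parts but not the field names.
interface ApiError {
  code: string; // stable, catalogued identifier
  message: string; // for humans; never parsed
  details: Record<string, unknown>; // machine-readable context
}

type NextStep = "retry_later" | "fix_input" | "reauthenticate" | "give_up";

// Branch on the stable code, never on the prose message.
// These codes are hypothetical examples, not the real catalogue.
function nextStep(err: ApiError): NextStep {
  switch (err.code) {
    case "rate_limited":
      return "retry_later";
    case "validation_failed":
      return "fix_input";
    case "token_expired":
      return "reauthenticate";
    default:
      return "give_up";
  }
}
```

Because the decision depends only on `code`, the human-readable message can be reworded or localized without breaking any automated caller.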

### 6. Idempotency enables retry [#6-idempotency-enables-retry]

Agents retry. Systems that penalise retry — duplicate records, doubled charges, phantom events — cannot be worked against reliably. Every write endpoint accepts an idempotency key ([API Design](/docs/principles/system-design/api-design)); every event consumer is de-duplicating ([Integration Patterns](/docs/principles/system-design/integration-patterns)).
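
A minimal sketch of server-side idempotency-key handling follows. It keeps state in memory where a real service would use a durable table, compares raw request bodies instead of a hash, and uses hypothetical names throughout.

```typescript
interface StoredResult {
  fingerprint: string; // the request body that first used this key
  response: string;
}

// In-memory stand-in for a durable idempotency table.
class IdempotencyStore {
  private seen: Record<string, StoredResult> = {};

  // Run `handler` at most once per key; replay the stored response on retry.
  // Reusing a key with a different request body is rejected.
  execute(key: string, requestBody: string, handler: () => string): string {
    if (Object.prototype.hasOwnProperty.call(this.seen, key)) {
      const prior = this.seen[key];
      if (prior.fingerprint !== requestBody) {
        throw new Error("idempotency key reused with a different request");
      }
      return prior.response;
    }
    const response = handler();
    this.seen[key] = { fingerprint: requestBody, response: response };
    return response;
  }
}
```

The retrying agent sends the same key with every attempt, so a timeout followed by a retry yields one record and the same response, never a duplicate.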

### 7. Outputs are structured where it matters [#7-outputs-are-structured-where-it-matters]

When an agent is producing a structured result — a database record, an API payload, a configuration fragment — we use schema-constrained generation (JSON schema, tool calling) rather than free-text-then-parse. Free-text parsing is how agent pipelines become brittle.

### 8. Documentation is reviewed for agent consumption [#8-documentation-is-reviewed-for-agent-consumption]

When we write a page, we ask: would an agent reading this through MCP understand what to do? If the page assumes visual hierarchy, colour, or context that does not survive serialisation, we re-shape it. Agent-readiness is a docs quality attribute, not a separate track of work.

## How we apply this [#how-we-apply-this]

* [/llms.txt](/llms.txt) and [/llms-full.txt](/llms-full.txt) — the canonical entry points.
* The MCP server at `scripts/mcp-server.ts` — the current tool and resource surface.
* [API Design](/docs/principles/system-design/api-design) — the OpenAPI discipline that makes our APIs agent-consumable.
* [Documentation](/docs/principles/foundations/documentation) — the dual-audience docs stance.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Auth flows that require human interaction.** A consent screen with a "click here" button is a dead end for an automated client. Design auth that supports programmatic token issuance.
* **Prose-only error responses.** `"something went wrong"` is unusable by any automated caller.
* **Undocumented "internal" APIs.** An API without a spec is an API that agents cannot use — which means humans will be asked to do the thing an agent should be doing.
* **MCP tools that wrap everything.** An MCP server that mirrors every endpoint in the API is noise. Expose the capabilities agents actually need, named in the agent's vocabulary.
* **Documentation that leans on rendered visuals.** An architecture diagram nobody can parse from Markdown is a diagram an agent cannot read. Prefer Mermaid source in the Markdown.

## Further reading [#further-reading]

* *Model Context Protocol* ([modelcontextprotocol.io](https://modelcontextprotocol.io)) — the canonical MCP specification.
* *llms.txt specification* ([llmstxt.org](https://llmstxt.org)) — the dual-audience docs convention.
* *OpenAPI Specification* ([openapis.org](https://www.openapis.org)) — the HTTP contract format.
* *Anthropic's agent engineering posts* — practical patterns for building agents against real APIs.
* *Simon Willison's blog* ([simonwillison.net](https://simonwillison.net)) — ongoing, practical commentary on the state of tooling.


# AI Engineering (/docs/principles/ai-native/ai-engineering)


# AI Engineering [#ai-engineering]

## TL;DR [#tldr]

AI engineering is software engineering with a non-deterministic component in the loop. We treat prompts as code, evaluations as tests, context as a first-class design surface, and agents as distributed systems. The discipline is about making probabilistic systems behave predictably enough to ship.

## Why this matters [#why-this-matters]

Every team that has tried to ship an AI feature has learned the same lesson the hard way: the part that feels like magic in a demo is the part that fails in unpredictable ways in production. The gap between "it works in the playground" and "it works for every user, every day" is where AI engineering happens. The discipline treats the non-determinism as an engineering problem — measurable, testable, and addressable — rather than as an inherent limitation to shrug at.

## Our principles [#our-principles]

### 1. Prompts are code [#1-prompts-are-code]

Prompts live in version control, where they are reviewed, tested, and versioned like any other source file. A prompt change is a code change; it ships through the same PR review as any other change. "We tweaked the prompt in the dashboard" is how a team loses the ability to reason about its own AI behaviour.

### 2. Evals are tests [#2-evals-are-tests]

Every meaningful AI behaviour has an eval: a scored comparison of model output against a reference. Evals run in CI; thresholds are committed; regressions block merge the same way unit-test failures do. Without evals, "did we make the model worse?" is unanswerable, which means every improvement is also a potential regression you will discover from users.
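
A minimal sketch of an eval as a test, assuming the simplest possible scorer (normalised exact match) and a stand-in model. `run_eval`, `exact_match`, and `fake_model` are hypothetical names; real evals usually need richer scoring than exact match.

```python
from typing import Callable


def exact_match(output: str, reference: str) -> float:
    """Simplest possible scorer: 1.0 on a normalised exact match, else 0.0."""
    return 1.0 if output.strip().lower() == reference.strip().lower() else 0.0


def run_eval(
    generate: Callable[[str], str],
    cases: list[tuple[str, str]],
    threshold: float,
) -> tuple[float, bool]:
    """Score every (input, reference) case and compare the mean to a committed threshold."""
    scores = [exact_match(generate(prompt), reference) for prompt, reference in cases]
    mean = sum(scores) / len(scores)
    return mean, mean >= threshold


def fake_model(prompt: str) -> str:
    """Stand-in for a model call, so the sketch runs without network access."""
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "unknown")
```

In CI, the boolean becomes the pass/fail signal: a drop below the committed threshold fails the build just as a failing unit test would.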

### 3. Context is the interface [#3-context-is-the-interface]

The content of the context window — what system prompt, what few-shot examples, what retrieved documents, what tool outputs — is the single biggest lever on model behaviour. We design it deliberately, measure its token budget, and treat it as a first-class interface. "Throw in everything relevant" is the anti-pattern that blows up the bill and dilutes the signal.
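
One way to make the token budget explicit is to assemble the context against it. The sketch below assumes documents arrive already ranked and uses a word count as a crude stand-in for a real tokenizer; `build_context` and `estimate_tokens` are hypothetical helpers.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def build_context(system_prompt: str, ranked_docs: list[str], budget: int) -> list[str]:
    """Keep the system prompt, then add documents in rank order while they fit the budget."""
    parts = [system_prompt]
    used = estimate_tokens(system_prompt)
    for doc in ranked_docs:
        cost = estimate_tokens(doc)
        if used + cost > budget:
            # Skip what does not fit rather than truncating mid-document.
            continue
        parts.append(doc)
        used += cost
    return parts
```

Skipping rather than truncating is a design choice for this sketch; a real system might prefer to summarise or re-chunk the document that does not fit.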

### 4. Retrieval matters more than the model [#4-retrieval-matters-more-than-the-model]

For most RAG systems, the retrieval layer determines the ceiling. A clever model with bad retrieval gives confident nonsense; a boring model with good retrieval gives boring, correct answers. We invest in the retrieval quality — indexing, ranking, reranking, chunk boundaries — before we invest in the model choice.

### 5. Model outputs are validated at the boundary [#5-model-outputs-are-validated-at-the-boundary]

Every model output that crosses into code is validated: shape, length, content, and expected enumerations. Parse failures are handled explicitly, not allowed to propagate. A model output flowing into business logic without validation is an injection vector waiting to happen.
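
A minimal sketch of boundary validation using only the standard library. The recap shape (`summary`, `sentiment`), the allowed values, and the length limit are assumptions made for the example, not the real schema.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}  # hypothetical enumeration
MAX_SUMMARY_CHARS = 500  # hypothetical limit


class ModelOutputError(ValueError):
    """Raised when a model output fails validation at the boundary."""


def parse_recap(raw: str) -> dict:
    """Validate shape, length, and enumerations before the output reaches business logic."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ModelOutputError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ModelOutputError("expected a JSON object")
    summary = data.get("summary")
    sentiment = data.get("sentiment")
    if not isinstance(summary, str) or not summary.strip():
        raise ModelOutputError("summary must be a non-empty string")
    if len(summary) > MAX_SUMMARY_CHARS:
        raise ModelOutputError("summary exceeds the length limit")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        raise ModelOutputError(f"unexpected sentiment: {sentiment!r}")
    # Return only the validated fields so unexpected keys never travel further.
    return {"summary": summary, "sentiment": sentiment}
```

Every failure surfaces as one explicit exception type, so the caller decides whether to retry, fall back, or report, instead of discovering a malformed value deep inside business logic.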

### 6. Agents are distributed systems [#6-agents-are-distributed-systems]

An agent loop, in which the model plans, takes an action, observes the result, and re-plans, has all the problems of a distributed system: retries, idempotency, timeouts, failure isolation. We apply the same patterns ([Integration Patterns](/docs/principles/system-design/integration-patterns)): bounded retries, circuit breakers, auditable history. The hardest agent failures are system failures, not model failures.
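
The loop can be sketched with a hard step limit, bounded retries per action, and an auditable history. This is an illustration under simple assumptions: `plan` and `act` are hypothetical callables standing in for the model and its tools, and the only transient failure modelled is `TimeoutError`.

```python
from typing import Callable


class StepBudgetExceeded(RuntimeError):
    """Raised when the loop hits its step limit without finishing."""


def run_agent(
    plan: Callable[[list[str]], str],
    act: Callable[[str], str],
    max_steps: int = 5,
    max_retries: int = 2,
) -> list[str]:
    """Plan, act, observe, re-plan, with a step limit and bounded retries per action."""
    history: list[str] = []  # auditable record of every action, failure, and observation
    for _ in range(max_steps):
        action = plan(history)
        if action == "done":
            return history
        for attempt in range(max_retries + 1):
            try:
                observation = act(action)
                history.append(f"{action} -> {observation}")
                break
            except TimeoutError:
                history.append(f"{action} -> timeout (attempt {attempt + 1})")
        else:
            # Every retry failed: stop instead of looping on a broken tool.
            raise RuntimeError(f"action failed after retries: {action}")
    raise StepBudgetExceeded(f"no result after {max_steps} steps")
```

Both exits are explicit: the loop either returns its history or raises, so a runaway plan cannot run past `max_steps`.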

### 7. Cost is part of the evaluation [#7-cost-is-part-of-the-evaluation]

A prompt that is 10% better but 5× more expensive is not obviously better. Evals track quality, latency, *and* cost, and decisions about which configuration to ship consider all three. Cost-unaware evaluation is how an AI feature becomes a cost incident after launch ([Cost Engineering](/docs/principles/delivery/cost-engineering)).

### 8. Human oversight is designed in [#8-human-oversight-is-designed-in]

For high-stakes AI outputs — a recap that a user will act on, an automated action taken on behalf of a user — we design the review point deliberately. The human reviewer gets a summary, not a wall of text; the review UX is built alongside the AI feature, not retrofitted. "Let the model do it" without a review loop is a promise the model will eventually fail.

## How we apply this [#how-we-apply-this]

* [ML Systems](/docs/principles/stack/ml-systems) — the implementation principles for the Python ML service.
* [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems) — the flip side, making our interfaces consumable by agents.
* [Observability](/docs/principles/quality/observability) — the trace surface for model calls.
* [Testing](/docs/principles/foundations/testing) — the broader testing discipline evals sit inside.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **"The model will figure it out."** Hope is not a design.
* **Prompts as configuration.** Untracked prompts drift silently, and evals cannot catch drift they are not told about.
* **Over-stuffed context windows.** Throwing the kitchen sink at the model is usually how quality *decreases*.
* **Skipping evals "this once."** This once becomes always. Evals compound when you have them and compound against you when you do not.
* **Agent loops without termination.** A loop without a clear exit condition is how a runaway agent becomes a runaway bill.
* **Deterministic reasoning on top of probabilistic output.** If you need a number, ask for a number in a structured schema. Do not regex-extract it from prose.

## Further reading [#further-reading]

* *Prompt Engineering Guide* ([promptingguide.ai](https://www.promptingguide.ai)) — the practitioner's summary of current patterns.
* *Evaluating and Reinforcing LLM Behaviors*, Shreya Shankar et al. — the academic grounding for eval design.
* *Building Effective Agents*, Anthropic — the reference for agent architecture patterns.
* *Context Engineering* (Shopify, 2024; see public writeups) — the emerging discipline that elevates context design to first-class engineering.
* *A Survey on Retrieval-Augmented Generation*, multiple authors — RAG ground truth.


# AI-Native (/docs/principles/ai-native)


# AI-Native [#ai-native]

Wordloop is AI-native in two directions: the product runs on AI (transcription, recap, embedding), and the team builds with AI (agents write substantial code, read documentation programmatically, and contribute to reviews). Both directions demand a stance — on how models are integrated, how agents consume our interfaces, and how we keep a human on the hook for outcomes.

<Cards>
  <Card title="AI Engineering" href="/docs/principles/ai-native/ai-engineering" description="Prompt engineering, evaluations, agent design, RAG, and context engineering — the disciplines that make AI features production-grade." />

  <Card title="Agent-Native Systems" href="/docs/principles/ai-native/agent-native-systems" description="Making APIs and docs AI-consumable: MCP, llms.txt, structured metadata, and the interfaces that let agents work alongside humans." />
</Cards>

Related reading: [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) — the structural choice that, more than any other, determines how effectively an agent can contribute to a codebase.


# Cost Engineering (/docs/principles/delivery/cost-engineering)


# Cost Engineering [#cost-engineering]

## TL;DR [#tldr]

Cost is a non-functional requirement with a dashboard and a dollar sign. Every significant architectural decision considers cost-per-user and cost-per-call; every service has a budget it lives inside; surprising spend is an incident. FinOps is how we stay honest about the economics of running what we build.

## Why this matters [#why-this-matters]

Most teams discover cost too late — after a quarterly bill raises eyebrows in a meeting. By then, the decisions that drove the cost are in production, have consumers, and are expensive to reverse. Cost engineering is the discipline of making the economic consequences of decisions visible at the point of the decision. It turns cost from a finance concern into an engineering variable.

## Our principles [#our-principles]

### 1. Cost is a first-class metric [#1-cost-is-a-first-class-metric]

Cost-per-call, cost-per-user, cost-per-feature — all tracked alongside latency and error rate. A feature's success includes its unit economics, not just its engagement numbers. A team that does not know what its features cost cannot reason about trade-offs that matter.

### 2. Budgets are set and defended [#2-budgets-are-set-and-defended]

Every significant service runs inside a cost budget. The budget is set at design time, reviewed monthly, and treated as a commitment. Exceeding budget triggers the same response as exceeding any other SLO: investigate, remediate, or explicitly negotiate an increase.

### 3. Autoscaling is designed, not enabled [#3-autoscaling-is-designed-not-enabled]

Autoscaling is a tool with sharp edges. Aggressive autoscaling on a bursty workload can multiply cost without improving user experience; conservative autoscaling on a steady workload wastes headroom. Each scaling policy is tuned per workload with the production load profile in mind, not set to vendor defaults and left.

### 4. Cheap queries beat fast queries [#4-cheap-queries-beat-fast-queries]

The fastest query is the one that does not run. We cache what we can, compute what we must, and denormalise when the read-to-write ratio justifies it. A cheap-and-fast query is a rare combination; when they conflict, the cheap version is usually the right default.

### 5. Egress is expensive; plan for it [#5-egress-is-expensive-plan-for-it]

Cloud provider egress is the most mispriced line item in most bills. Inter-region chatter, chatty logs, bulky screenshots uploaded constantly — these add up. We place data where its consumers are, batch where we can, and compress where it is cheap to do so.

### 6. AI spend has the same discipline [#6-ai-spend-has-the-same-discipline]

Every model call has a measured cost and a caching strategy. Prompts are versioned with token-count measurement; expensive prompts are justified by value. "Just pass the whole context to the largest model" is how an AI feature becomes a cost incident ([ML Systems](/docs/principles/stack/ml-systems)).

### 7. Reservations and commits where they pay [#7-reservations-and-commits-where-they-pay]

For predictable baseline workloads, reserved instances and committed-use discounts can save roughly 30-50% over on-demand. The discipline is to match the reservation to the baseline: over-reserving pays for capacity we never use, and under-reserving leaves baseline load running at on-demand rates.
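
A worked sketch of why the reservation should track the baseline. The numbers are invented for the example: four hours of usage with a baseline of 10 units and one burst to 14, an on-demand rate of 1.0 per unit-hour, and a 40% discount on reserved capacity.

```python
def total_cost(
    usage_per_hour: list[float],
    reserved_units: float,
    on_demand_rate: float,
    discount: float,
) -> float:
    """Cost when reserved units are paid for every hour and overflow runs on demand."""
    reserved_rate = on_demand_rate * (1 - discount)
    total = 0.0
    for used in usage_per_hour:
        total += reserved_units * reserved_rate  # paid whether used or not
        total += max(0.0, used - reserved_units) * on_demand_rate  # overflow at full price
    return total


usage = [10, 10, 14, 10]
no_reservation = total_cost(usage, 0, 1.0, 0.4)     # everything on demand
matched = total_cost(usage, 10, 1.0, 0.4)           # reservation equals the baseline
over_reserved = total_cost(usage, 14, 1.0, 0.4)     # reservation sized for the peak
```

With these inputs the matched reservation is the cheapest of the three, the peak-sized reservation pays for idle capacity in three of the four hours, and running with no reservation pays full price for the baseline.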

### 8. FinOps is a practice, not an office [#8-finops-is-a-practice-not-an-office]

Cost engineering is something every team does, not a team that does it on behalf of others. The central function provides tooling and visibility; the distributed decisions are made by the teams that built the spend.

## How we apply this [#how-we-apply-this]

* [Observability](/docs/principles/quality/observability) — the measurement substrate for cost per unit.
* [ML Systems](/docs/principles/stack/ml-systems) — the cost discipline for model calls.
* [Platform](/docs/principles/delivery/platform) — the shared infra that every team's cost sits on.
* [Performance](/docs/principles/quality/performance) — cheap code is often also fast code.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **"We will optimise cost later."** Later never comes; the architecture is what it is by then.
* **Autoscale-and-forget.** Default autoscaling on a workload you have not profiled is how you get a thousand-dollar day.
* **Chatty logs forever.** Unstructured debug logs at volume are a non-trivial line on the bill.
* **AI calls without budget.** Model spend without a measured cost-per-request grows silently until the bill makes it visible.
* **"It's just pennies."** Pennies × N × daily = a real number. Track it.

## Further reading [#further-reading]

* *Cloud FinOps*, Storment & Fuller — the canonical text on cross-functional cost management.
* *The Cost of Complexity*, Frederic Lardinois (various articles) — the essays on why complex architectures cost more than they appear.
* *AWS Well-Architected Framework — Cost Optimization pillar* — applicable beyond AWS, useful as a checklist.
* *FinOps Foundation framework* ([finops.org](https://finops.org)) — the practitioner's handbook.


# Developer Experience (/docs/principles/delivery/devex)


# Developer Experience [#developer-experience]

## TL;DR [#tldr]

A team ships as fast as its feedback loop lets it. We invest deliberately in the inner loop — the seconds between a code change and the evidence that the change works — because every second saved there is paid back a thousand times over across the team. `./dev` is our golden path, DORA metrics are how we measure the loop, and friction in the loop is an engineering bug.

## Why this matters [#why-this-matters]

The single largest predictor of a team's output, over months and years, is the quality of its feedback loop. A team that sees the result of a change in five seconds ships more and ships better than a team that sees it in five minutes — not because the individuals are smarter, but because the loop of hypothesis-and-test runs an order of magnitude more often. Developer experience is not a perk; it is an engineering lever.

## Our principles [#our-principles]

### 1. The inner loop is sacred [#1-the-inner-loop-is-sacred]

The inner loop is the sequence from "I think this code will work" to "yes or no, here is the evidence." We invest in making this loop as short as it can be: incremental compilation, test selection, hot reload, one-command bootstrapping, fast linting. Every second shaved off the inner loop multiplies across every engineer, every day.

### 2. `./dev` is the single entry point [#2-dev-is-the-single-entry-point]

Every local task — start, stop, test, lint, migrate, deploy, generate — runs through `./dev`. One command to remember, one tool to teach a new engineer, one surface to improve. Proliferating ad-hoc scripts in `Makefile`, `package.json`, and `bin/` is how a developer experience becomes a treasure hunt.

### 3. Golden paths, not mandatory paths [#3-golden-paths-not-mandatory-paths]

The golden path is the well-trodden, well-supported way to do a common task. It is the default route for new engineers and agents alike. Deviation is allowed when a task genuinely does not fit, but the deviator pays the cost of their own tooling. Golden paths concentrate investment; mandatory paths breed resentment.

### 4. DORA metrics keep us honest [#4-dora-metrics-keep-us-honest]

Deployment frequency, lead time for changes, change failure rate, mean time to recover — the four DORA metrics are how we measure whether the delivery system is healthy. We track them, surface them, and react to them. A regression in any one of the four is a signal to invest in the loop.
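
As a rough sketch, the four metrics can be computed from a list of deploy records. The `Deploy` shape and its field names are hypothetical, and the choice of median for lead time and mean for recovery is one reasonable convention, not the only one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass(frozen=True)
class Deploy:
    """Hypothetical deploy record; field names are illustrative."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def dora(deploys: list[Deploy], window_days: int) -> dict[str, float]:
    """Compute the four metrics over a window; durations are reported in hours."""
    failures = [d for d in deploys if d.failed]
    lead_times = [(d.deployed_at - d.committed_at) / timedelta(hours=1) for d in deploys]
    recoveries = [
        (d.restored_at - d.deployed_at) / timedelta(hours=1)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recover_hours": sum(recoveries) / len(recoveries) if recoveries else 0.0,
    }
```

The function assumes at least one deploy in the window; an empty window would need its own handling.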

### 5. Onboarding time-to-first-value is a design target [#5-onboarding-time-to-first-value-is-a-design-target]

A new engineer should reach their first local contribution — "I changed something and I can see the change" — in their first day. A new service should reach its first deploy in the first week. These are targets we hold ourselves to, and regressions here are treated as bugs.

### 6. Documentation is part of the loop [#6-documentation-is-part-of-the-loop]

A command you cannot find is a command you do not use. Every `./dev` subcommand has a reference entry, every golden path has a guide, every service has a handbook. The documentation exists so the loop does not depend on tribal memory.

### 7. Local environments match production shape [#7-local-environments-match-production-shape]

The local stack uses the same Postgres version, the same Pub/Sub contract, the same container runtime. "It works on my machine" is eliminated by eliminating the gap between the machines. Emulation over mocks ([Testing](/docs/principles/foundations/testing)) applies here too.

### 8. Friction is filed as a bug [#8-friction-is-filed-as-a-bug]

If a process is painful, that pain is a bug. File it, prioritise it, fix it. "Everyone deals with it" is how chronic friction becomes chronic velocity loss. The developer experience team — or whoever is the local maintainer of `./dev` — owns the backlog the same way a product team owns its user-bug backlog.

## How we apply this [#how-we-apply-this]

* [CLI Reference](/docs/reference/cli) — the surface of `./dev`.
* [Quickstart](/docs/start/quickstart) — the first-contact experience we measure.
* [Platform](/docs/principles/delivery/platform) — the broader internal platform `./dev` is a part of.
* [Progressive Delivery](/docs/principles/delivery/progressive-delivery) — the outer loop the inner loop feeds into.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **"Follow the README and read between the lines."** Onboarding that depends on tacit knowledge is not onboarding.
* **Five CLIs for five tasks.** `./dev` is one. A second CLI earns its existence by solving a problem `./dev` cannot.
* **Skip-the-test culture.** Fast-but-unreliable tests are worse than slow-but-reliable tests. The inner loop is made fast by honest investment, not by cheating.
* **DORA theatre.** Tracking the metric while not responding to it is worse than not tracking it at all.
* **Ignoring friction.** If you find a sharp edge, file the ticket. Do not route around it silently.

## Further reading [#further-reading]

* *Accelerate*, Forsgren, Humble, Kim — the empirical foundation for DORA metrics.
* *The DevOps Handbook*, Kim et al. — the full treatment of the inner-and-outer loop view.
* *Team Topologies*, Skelton & Pais — the organisational side of platform and golden paths.
* *Developer Experience: Concept and Definition* (Fagerholm & Münch, 2012) — the academic framing that predates the modern DevEx term.


# Delivery (/docs/principles/delivery)


# Delivery [#delivery]

Delivery is the discipline of turning code into running software that users can feel. The pages in this section describe the four practices that determine whether our delivery loop is a source of leverage or a source of toil: developer experience, progressive delivery, platform engineering, and cost engineering.

<Cards>
  <Card title="DevEx" href="/docs/principles/delivery/devex" description="Golden paths, paved roads, inner-loop speed, and DORA metrics as the measure of whether the loop is healthy." />

  <Card title="Progressive Delivery" href="/docs/principles/delivery/progressive-delivery" description="Feature flags, canaries, trunk-based development, and deployment strategies that let us move fast without breaking things." />

  <Card title="Platform" href="/docs/principles/delivery/platform" description="Internal developer platforms, self-service tooling, and the treatment of `./dev` as a product in its own right." />

  <Card title="Cost Engineering" href="/docs/principles/delivery/cost-engineering" description="FinOps, cost-aware architecture, and the economics of autoscaling." />
</Cards>


# Platform (/docs/principles/delivery/platform)


# Platform [#platform]

## TL;DR [#tldr]

The platform is the substrate every application team builds on: the local stack, the CI/CD pipeline, the observability collector, the secrets manager, and the internal developer portal (IDP) that fronts all of it. We treat the platform as a product: it has users (us), a backlog, a quality bar, and explicit investment. A good platform makes the right thing the easy thing.

## Why this matters [#why-this-matters]

Every team in a multi-service organisation eventually arrives at the same realisation: the biggest drag on productivity is not the code the team writes, but the accumulated friction of the common plumbing every project has to assemble. A platform that handles the plumbing well turns that friction into a paved road. A platform that does not becomes a tax every project pays repeatedly. The quality of the platform is a direct multiplier on the output of every engineer on top of it.

## Our principles [#our-principles]

### 1. Platform is a product, with users and a roadmap [#1-platform-is-a-product-with-users-and-a-roadmap]

The people who build the platform have explicit users — the application engineers — and treat their work as a product: backlog, priorities, measurement, feedback. A platform maintained "when we have time" decays; a platform treated as product investment compounds.

### 2. Self-service is the goal [#2-self-service-is-the-goal]

Every common task — spinning up a new service, requesting a secret, adding an OTel dashboard, changing a feature flag — should be self-service. When an application team has to file a ticket and wait for the platform team, the platform is the bottleneck. Self-service is the acid test.

### 3. Golden paths over policy [#3-golden-paths-over-policy]

We pave specific paths (how to create a service, how to deploy, how to observe) and we make those paths the easiest route. Policy documents without paved paths produce compliance in shape but drift in substance.

### 4. `./dev` is the platform's front door [#4-dev-is-the-platforms-front-door]

For local workflows, `./dev` is the abstraction over every underlying tool: Docker, pnpm, uv, Air, migrate. The platform team maintains `./dev`; application teams use it without needing to know what is under it. See [DevEx](/docs/principles/delivery/devex).

### 5. One paved-road CI pipeline [#5-one-paved-road-ci-pipeline]

One pipeline definition per language: one for every Go service, one for every Python service, one for every TypeScript service. Teams that deviate bear the cost of maintaining their own pipeline. This is how we prevent snowflake CI configurations from accumulating.

### 6. Observability is part of the platform [#6-observability-is-part-of-the-platform]

Traces, metrics, and logs flow through the same collector, into the same backend, onto the same dashboards. Observability set up by each team independently ([Observability](/docs/principles/quality/observability)) is observability broken in five different ways.

### 7. The platform gets the same scrutiny as the product [#7-the-platform-gets-the-same-scrutiny-as-the-product]

Platform code is reviewed, tested, versioned, and deployed the same way product code is. A broken platform release can hurt every team at once, so the bar is actually higher. "It is just tooling, ship it" is how a platform becomes an obstacle.

### 8. Measure what the users feel [#8-measure-what-the-users-feel]

Platform success is measured by the application teams' outcomes — DORA metrics, onboarding time, number of tickets filed against the platform. Not by the platform team's own output metrics, which can be excellent while the users are miserable.

## How we apply this [#how-we-apply-this]

* [CLI Reference](/docs/reference/cli) — the `./dev` surface.
* [DevEx](/docs/principles/delivery/devex) — the developer-facing experience the platform enables.
* [Observability](/docs/principles/quality/observability) — the centralised telemetry substrate.
* [Progressive Delivery](/docs/principles/delivery/progressive-delivery) — the CI/CD pipeline as a platform service.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Platform-as-gatekeeper.** A platform that says "no" more than it says "self-serve" is a bottleneck, not a platform.
* **Five ways to do one thing.** Historical pipelines that nobody cleaned up. The platform should consolidate.
* **Tooling that only the platform team can use.** If the API requires insider knowledge, the tool is incomplete.
* **"Platform investment later."** The platform is either invested in or decaying; there is no steady state.
* **Metrics for the platform's sake.** Measuring "tickets closed by platform team" without measuring application-team outcomes misses the point.

## Further reading [#further-reading]

* *Team Topologies*, Skelton & Pais — the canonical framing of platform teams and enabling teams.
* *Platform Engineering on Kubernetes*, Mauricio Salatino — the practical engineering view.
* *The DevOps Handbook*, Kim et al. — the broader cultural context the platform sits inside.
* *Backstage documentation* ([backstage.io](https://backstage.io)) — the archetype of an internal developer portal.


# Progressive Delivery (/docs/principles/delivery/progressive-delivery)


# Progressive Delivery [#progressive-delivery]

## TL;DR [#tldr]

Progressive delivery is how we decouple the act of deploying code from the act of releasing a feature. We ship to production multiple times a day from a single branch, but users see changes only when we open a flag, route a canary, or promote a cohort. The production environment is stable; the user experience is controlled independently.

## Why this matters [#why-this-matters]

The reason most teams avoid shipping often is that shipping carries risk — a bad deploy can break production for every user at once. Progressive delivery breaks the link. A deploy puts the code into production. A release makes the code reach users. With the two decoupled, deploys become small, frequent, and boring; releases become observable, controllable, and reversible. That asymmetry is how modern teams sustain a fast release cadence without a proportional rate of incidents.

## Our principles [#our-principles]

### 1. Trunk-based development with short-lived branches [#1-trunk-based-development-with-short-lived-branches]

Every change lands on `main` as soon as it is ready. Branches live for days, not weeks. Long-lived branches are how integration bugs accumulate quietly; trunk-based development surfaces them constantly, which makes them cheap to fix.

### 2. Deploy on every merge [#2-deploy-on-every-merge]

Main is always deployable, and we deploy from it continuously. A merged PR reaches production within the deploy window — not hours or days later. This is enforced by automation; a team that relies on a human "release engineer" has already lost the bet on cadence.

### 3. Feature flags separate deploy from release [#3-feature-flags-separate-deploy-from-release]

A new feature is deployed behind a flag, defaulted off. The flag state decides who sees the feature — nobody, internal users, a cohort, everyone. A bad feature is disabled without a redeploy; a controversial feature is rolled to 1% before 100%. Flags are a core primitive, not a third-party dependency.
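
A minimal sketch of a percentage rollout, assuming users are assigned to stable buckets by hashing the flag name with the user id. `bucket` and `is_enabled` are hypothetical helpers, not the flag system's real API.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0 to 99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(
    flag: str,
    user_id: str,
    rollout_percent: int,
    allowlist: frozenset[str] = frozenset(),
) -> bool:
    """Off by default: on only for allowlisted users or buckets below the rollout."""
    if user_id in allowlist:
        return True
    return bucket(flag, user_id) < rollout_percent
```

Because the bucket is derived from a hash rather than a random draw, a user who sees the feature at 10% still sees it at 50%, and setting the percentage back to 0 disables it without a redeploy.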

### 4. Canary before promote [#4-canary-before-promote]

Every release that could affect latency, reliability, or user experience goes through a canary — a small fraction of traffic for a bounded window — before promoting. Canary signals (error rate, p99 latency, user journey success) are automated comparisons, not eyeballs on a dashboard.
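
An automated comparison might look like the sketch below. The `Signals` shape and the three tolerances are hypothetical values chosen for the example; real thresholds depend on the service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Aggregated signals for one cohort over the canary window."""
    error_rate: float        # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float
    journey_success: float   # fraction of user journeys completed, 0.0 to 1.0


def canary_verdict(
    baseline: Signals,
    canary: Signals,
    max_error_delta: float = 0.005,   # hypothetical tolerances
    max_latency_ratio: float = 1.10,
    max_success_drop: float = 0.01,
) -> list[str]:
    """Return the violated checks; an empty list means the canary may be promoted."""
    violations = []
    if canary.error_rate - baseline.error_rate > max_error_delta:
        violations.append("error_rate")
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        violations.append("p99_latency")
    if baseline.journey_success - canary.journey_success > max_success_drop:
        violations.append("journey_success")
    return violations
```

Returning the list of violations, rather than a bare boolean, gives the on-call engineer the reason for a blocked promotion without opening a dashboard.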

### 5. Release is reversible, cheaply [#5-release-is-reversible-cheaply]

Every release has a rollback path that can be executed in a few minutes by any on-call engineer. Database migrations are designed reversibly ([Migrate the Schema](/docs/guides/migrate-schema)); flags can be flipped; canaries can be re-routed. "We can't roll that back" is a red flag on the release itself.

### 6. Flag hygiene is continuous [#6-flag-hygiene-is-continuous]

Flags are an asset and a debt. A long-lived flag that nobody remembers the purpose of is a drag on every future change. Every flag has an owner, a purpose, and an expiry date; stale flags are removed in the normal course of work.
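
One way to keep the owner, purpose, and expiry honest is to make them required fields and check expiry mechanically. The `Flag` record and `stale_flags` helper below are hypothetical, shown only to illustrate the shape of such a check.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Flag:
    """A flag cannot be declared without an owner, a purpose, and an expiry date."""
    name: str
    owner: str
    purpose: str
    expires_on: date


def stale_flags(flags: list[Flag], today: date) -> list[str]:
    """Names of flags past their expiry date, oldest first."""
    expired = [f for f in flags if f.expires_on < today]
    return [f.name for f in sorted(expired, key=lambda f: f.expires_on)]
```

A check like this could run in CI or on a schedule, turning stale flags into routine cleanup work rather than archaeology.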

### 7. Observability defines "healthy" [#7-observability-defines-healthy]

A release is healthy when the relevant user-journey SLOs are within tolerance ([Reliability](/docs/principles/quality/reliability)). Not when CPU is low, not when memory is steady — when users' journeys are succeeding at the rate they did before. The canary is evaluated against SLO burn rates.
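
Burn rate is the observed error rate divided by the error budget, where the budget is one minus the SLO target. A rate of 1.0 spends the budget exactly over the SLO window; anything higher spends it early. The sketch below assumes a simple request-count SLO, and the `max_burn` threshold is a hypothetical default.

```python
def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget (1 - SLO target)."""
    if total == 0:
        return 0.0  # no traffic, nothing burned
    error_budget = 1.0 - slo_target
    return (failed / total) / error_budget


def canary_is_healthy(failed: int, total: int, slo_target: float, max_burn: float = 1.0) -> bool:
    """Healthy when the canary is not spending error budget faster than allowed."""
    return burn_rate(failed, total, slo_target) <= max_burn
```

For a 99% SLO, 5 failures in 1,000 journeys is a burn rate of about 0.5, while 20 failures in 1,000 is about 2.0 and would block promotion under the default threshold.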

### 8. The release story is the same for every service [#8-the-release-story-is-the-same-for-every-service]

One rollout model, one flag system, one canary pattern. Different services with different release mechanics multiply cognitive load and reduce the effectiveness of the on-call engineer. Consistency is a force multiplier.

## How we apply this [#how-we-apply-this]

* [DevEx](/docs/principles/delivery/devex) — the inner loop that feeds into continuous delivery.
* [Reliability](/docs/principles/quality/reliability) — the SLO surface that gates canary promotion.
* [Observability](/docs/principles/quality/observability) — the signal layer for release health.
* [Deploy](/docs/guides/deploy) — the canonical deploy workflow.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Release trains.** Batching up a month of changes and shipping them on Friday is how you get a huge, unreviewable deploy that breaks in ways nobody can localise.
* **Flags without expiry.** A flag that has been "temporary" for a year is permanent — and a permanent decision hidden inside a runtime config.
* **Canary-by-eyeball.** Promoting because the graph "looks fine" is a coin flip. Automate the comparison.
* **"We will test it in staging."** Staging has no users. A canary in production is the only test of production behaviour.
* **Commit-and-hope.** No canary, no flag, deploy to 100%. You will find out in the morning.

## Further reading [#further-reading]

* *Accelerate*, Forsgren, Humble, Kim — the data on trunk-based development and its outcomes.
* *Continuous Delivery*, Humble & Farley — the canonical treatment of the release pipeline.
* *Progressive Delivery*, James Governor (RedMonk, 2018) — the essay that named the practice.
* *Release It!* (2nd edition), Michael Nygard — the stability-pattern view of rollout.


# Code Craft (/docs/principles/foundations/code-craft)


# Code Craft [#code-craft]

## TL;DR [#tldr]

Code is read far more than it is written. Our craft is to write code that the next reader — human or agent — can understand, change, and delete with confidence. Simplicity is the default; abstraction is a cost that must be earned.

## Why this matters [#why-this-matters]

In a codebase that is alive for more than a year, the dominant cost is not writing code — it is understanding the code already there so you can change it. Every abstraction, every layer of indirection, every "flexible" interface is a tax on future readers. Our stance is that taxes must be justified. When we optimise for future flexibility we have not yet needed, we pay a certain cost today against an uncertain benefit later; more often than not, the benefit never arrives and we are left with the cost.

## Our principles [#our-principles]

### 1. Simpler is better than clever [#1-simpler-is-better-than-clever]

A function that a tired engineer can understand in thirty seconds is worth more than a function that demonstrates the author's taste in type systems. Prefer plain data structures over clever abstractions, plain control flow over meta-programming, plain naming over in-joke naming. When "clever" and "clear" conflict, clear wins.

### 2. No speculative abstraction [#2-no-speculative-abstraction]

Do not build a generalisation until you have at least three concrete use cases driving the same shape. Premature abstractions are harder to change than the duplication they replace, because now you have to understand the abstraction, the use cases, and the compatibility between them before you can change any of them. Three similar lines of code are almost always better than a half-designed helper.

### 3. Deletion is a virtue [#3-deletion-is-a-virtue]

The code you delete cannot break, cannot require maintenance, cannot confuse the next reader, and cannot leak a vulnerability. When a feature is removed, the code should go with it — including the tests, the config flags, and the docs. Leaving dead code "just in case" is a bet that is almost always wrong: if we need it back, we will write a clearer version with the benefit of hindsight.

### 4. Names are the interface [#4-names-are-the-interface]

A badly named function is a broken interface even if its behaviour is correct, because every caller has to read the implementation to know what it does. We spend time on names. We rename aggressively when a better name becomes clear. Variables, functions, types, files, directories — all of them communicate, and a mismatch between name and behaviour is a bug.

### 5. Comments explain the "why" [#5-comments-explain-the-why]

Code explains the "what", so a comment that restates it is redundant. Names explain the "who" and "where." The only thing left for a comment is the "why": the non-obvious constraint, the invariant that must hold, the bug that drove an odd choice, the reference to an ADR. If a comment would be obvious to anyone who read the surrounding code, it is noise.

### 6. Error handling is design, not decoration [#6-error-handling-is-design-not-decoration]

Errors are a first-class part of the interface, not an afterthought. We decide — explicitly — which errors a function can return, how callers are expected to respond, and where the boundary between recoverable and fatal is. `err != nil` sprinkled through a codebase without a model behind it is a failure of design.
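
One way to make the error model explicit is to give a module a small, closed set of error types and have each one declare how callers should respond. The sketch below is illustrative only: the exception names, the `retryable` attribute, and the status codes are assumptions made for the example, not an existing Wordloop API.

```python
from typing import Optional


class LoadError(Exception):
    """Base class: every error load_transcript can raise derives from this."""
    retryable = False


class NotFound(LoadError):
    """Recoverable by the caller: the id does not exist."""


class StoreUnavailable(LoadError):
    """Transient: the caller may retry later."""
    retryable = True


def load_transcript(store: Optional[dict], transcript_id: str) -> str:
    """Return the transcript text, or raise one of the documented LoadError types."""
    if store is None:  # stand-in for "the backing store cannot be reached"
        raise StoreUnavailable("transcript store is not reachable")
    try:
        return store[transcript_id]
    except KeyError:
        raise NotFound(transcript_id) from None


def respond(store: Optional[dict], transcript_id: str) -> tuple:
    """Boundary handler: the only place errors are translated into status codes."""
    try:
        return 200, load_transcript(store, transcript_id)
    except NotFound:
        return 404, "transcript not found"
    except LoadError as err:
        return (503 if err.retryable else 500), "try again later"
```

The same idea carries over to Go with sentinel or typed errors checked via `errors.Is` and `errors.As`; what matters is that the set of errors and the expected response to each is decided once, not rediscovered at every call site.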

### 7. Trust the boundary; distrust the internal [#7-trust-the-boundary-distrust-the-internal]

We validate at system boundaries — user input, external APIs, message payloads — where the data is untrusted. We do not re-validate between internal callers in the same service; if an internal contract is wrong, the right fix is the contract, not a runtime check in every consumer. Defensive programming inside the trust boundary is a form of noise.
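
A minimal sketch of this split, assuming a hypothetical "create meeting" payload: the untrusted dictionary is parsed into a typed value exactly once, and internal functions then rely on that type instead of re-checking it. The field names and `CreateMeeting` type are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateMeeting:
    """Validated command: once constructed by the parser, its fields are trusted."""
    title: str
    duration_minutes: int


def parse_create_meeting(payload: dict) -> CreateMeeting:
    """Boundary: the payload is untrusted, so every field is checked here, once."""
    title = payload.get("title")
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title must be a non-empty string")
    duration = payload.get("duration_minutes")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or duration <= 0:
        raise ValueError("duration_minutes must be a positive integer")
    return CreateMeeting(title=title.strip(), duration_minutes=duration)


def end_offset_seconds(meeting: CreateMeeting) -> int:
    """Internal: trusts the contract; no re-validation of title or duration."""
    return meeting.duration_minutes * 60
```

If `end_offset_seconds` ever receives a non-positive duration, the fix belongs in the parser or the type, not in a guard added to every consumer.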

### 8. Dead code is a bug [#8-dead-code-is-a-bug]

Commented-out code, `_unused` variables, orphan functions, legacy configuration: all of it degrades the signal-to-noise ratio of the codebase. When we find it, we delete it. `git` preserves anything we lose; the working tree should contain only code that is alive today.

## How we apply this [#how-we-apply-this]

* [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) — the structural discipline that makes simplicity scalable.
* [Testing](/docs/principles/foundations/testing) — tests that exercise behaviour keep refactoring cheap.
* [Go Services](/docs/principles/stack/go-services) — the idioms that keep our Go code readable.
* [Frontend](/docs/principles/stack/frontend) — the conventions that keep our React code readable.
* [Decisions](/docs/decisions) — the ADRs that capture the "why" our comments do not.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Defensive programming without a threat model.** Guarding every internal call against nil is not robustness — it is distrust of our own type system.
* **"Might need it later" scaffolding.** Config flags for scenarios that do not exist, plugin systems with one plugin, interfaces with one implementation. Delete.
* **Fashion-driven refactors.** Rewriting working code to match a new pattern the team read about this week is debt, not progress.
* **Multi-paragraph docstrings.** If the function needs a multi-paragraph docstring to be understood, the function is wrong. Split it, rename it, or simplify it — then the docstring is not needed.
* **Backwards-compatibility shims for internal APIs.** If it is fully internal, changing it is allowed and expected; compatibility layers are debt we impose on ourselves for no benefit.

## Further reading [#further-reading]

* *A Philosophy of Software Design*, John Ousterhout — deep-module principle, the cost of shallow abstractions.
* *Tidy First?*, Kent Beck — the economics of refactoring as a separable activity.
* *The Pragmatic Programmer*, Hunt & Thomas — the canonical treatment of names, duplication, and orthogonality.


# Documentation (/docs/principles/foundations/documentation)


# Documentation [#documentation]

## TL;DR [#tldr]

Documentation is an active product surface. Wordloop docs are the canonical source for durable engineering knowledge; agent skills are the execution layer that selects, loads, and applies that knowledge safely. We design documentation for humans and AI agents at the same time, organise it with Diátaxis, expose it through `llms.txt`, Markdown exports, and MCP, and enforce freshness with automation wherever a human would drift.

## Why this matters [#why-this-matters]

In 2026, documentation is part of the runtime environment for engineering work. A human reads the site through navigation and search; an agent reads the same knowledge through MCP resources, `llms.txt`, `llms-full.txt`, and per-page Markdown exports. If those surfaces disagree, the system teaches different readers different truths. That is not a documentation problem; it is an engineering defect.

The operating model is simple: **docs hold the knowledge, skills control the agent behaviour**. Durable guidance belongs in the docs site where humans and agents can inspect it. Skill files stay concise and directive: they define when to trigger, what context to load, which tools to use, and which safety checks must run. This keeps prompts lean, reduces duplicated policy, and gives us one canonical place to correct factual drift.

## Our principles [#our-principles]

### 1. Documentation is canonical knowledge [#1-documentation-is-canonical-knowledge]

Architecture principles, service handbooks, workflow guides, glossary terms, ADRs, API references, and generated schemas belong in the docs site. A skill may point to these pages, but it does not become the source of truth for material that humans also need to understand.

### 2. Skills are the agent execution layer [#2-skills-are-the-agent-execution-layer]

Agent skills are a control surface, not a second documentation site. A skill owns triggering, task routing, tool use, safety constraints, verification steps, and context-loading instructions. It should say, for example, "read the App service handbook before changing `wordloop-app` data fetching," not duplicate the handbook in full.

### 3. AI-native documentation is first class [#3-ai-native-documentation-is-first-class]

Every important documentation surface must survive machine consumption. We publish `llms.txt` as the curated index, `llms-full.txt` as the consolidated corpus, `.md` exports for individual pages, and MCP resources for structured retrieval. Agent-readiness is not an afterthought or an SEO trick; it is a quality attribute of the docs system.

### 4. Diátaxis is the structural frame [#4-diátaxis-is-the-structural-frame]

We organise by reader intent, not by our internal org chart. Tutorials teach, how-to guides solve, reference pages support lookup, and explanation pages build understanding. A page that mixes these jobs forces both humans and agents to infer the purpose from context, which makes retrieval weaker and maintenance harder.

### 5. Active docs replace passive docs [#5-active-docs-replace-passive-docs]

A page is not "done" when it is written. Active docs declare ownership, review cadence, freshness status, and source-of-truth boundaries. Pages that age past their review window are visibly flagged and reviewed as part of normal engineering work, not as a cleanup project.

### 6. Automation is the first reviewer [#6-automation-is-the-first-reviewer]

Automated checks enforce the cheap, high-signal rules: required frontmatter, broken internal links, stale review dates, invalid skill-to-doc references, stale generated corpora, and known version mismatches. Humans review accuracy, judgment, and usefulness. Automation handles the facts it can verify without fatigue.
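
The "required frontmatter" rule is the kind of check that fits in a few lines. The sketch below is an assumption-laden illustration, not the actual Wordloop check: the required keys (`title`, `owner`, `last_reviewed`) are hypothetical field names, and the parser only handles flat `key: value` frontmatter between `---` delimiters.

```python
REQUIRED_KEYS = ("title", "owner", "last_reviewed")  # hypothetical field names


def frontmatter_keys(page_text: str) -> set:
    """Return the top-level keys of a page's frontmatter, or an empty set if absent."""
    lines = page_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys  # closing delimiter reached
        # Only unindented "key: value" lines count as top-level keys.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter: treat as missing frontmatter


def missing_frontmatter(page_text: str) -> list:
    """List the required keys a page does not declare, in a stable order."""
    present = frontmatter_keys(page_text)
    return [key for key in REQUIRED_KEYS if key not in present]
```

A CI job could run this over every page and fail when the returned list is non-empty, leaving human reviewers to judge accuracy and usefulness.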

### 7. Prefer generated reference over prose [#7-prefer-generated-reference-over-prose]

API specs, event contracts, database schemas, CLI command tables, and error catalogues have machine-readable sources. We render them from those sources instead of hand-writing reference pages. Hand-written reference material drifts; generated reference material can be rebuilt and checked.

### 8. Decisions are append-only [#8-decisions-are-append-only]

Hard-to-reverse decisions live in ADRs. Accepted ADRs are not edited to match current preference; they are superseded. Each ADR carries enough consequence and debt context for a future reader to understand why the decision existed, what it cost, and when to revisit it.

### 9. Metadata interoperability matters [#9-metadata-interoperability-matters]

Formal documentation standards are useful when they sharpen interoperability discipline. ISO/PAS 25955:2026 is a Publicly Available Specification for Data Documentation Initiative interoperability, not a generic agent-documentation linking standard. The lesson we apply is precise metadata, stable identifiers, and explicit relationships between documentation objects. For agent discovery specifically, Wordloop uses `llms.txt`, Markdown exports, MCP resources, and HTTP `Link` headers.

### 10. Drift is corrected at the source [#10-drift-is-corrected-at-the-source]

When code, docs, skills, specs, and design records disagree, we identify the source of truth before editing. Code and generated contracts win for shipped runtime behaviour. ADRs win for historical decisions. Active design docs win for current delivery intent until the shipped system proves otherwise. Skills win for agent execution behaviour only.

## Freshness model [#freshness-model]

| Surface                 |                        Review window | Freshness rule                                                                      |
| ----------------------- | -----------------------------------: | ----------------------------------------------------------------------------------- |
| Principles              |                             6 months | Review when operating model or engineering policy changes.                          |
| Service handbooks       |                             3 months | Review when code structure, stack versions, commands, or service boundaries change. |
| API and event reference |                Every contract change | Generated from OpenAPI and AsyncAPI sources.                                        |
| Runbooks                |                             3 months | Review after incidents, operational changes, or ownership changes.                  |
| Active bet and TDD docs | Every material implementation change | Keep design intent aligned with delivery reality.                                   |
| Delivered bet docs      |                           Historical | Freeze except for explicit correction notes.                                        |
| ADRs                    |                           Historical | Supersede instead of rewriting accepted records.                                    |
| Agent skills            |    Every skill or mapped docs change | Validate trigger logic, context routing, and verification steps.                    |

See [Documentation Freshness](/docs/operations/documentation-freshness) for the operational policy.
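
The calendar-based rows of the table translate directly into a staleness check. This is a sketch under stated assumptions: the surface keys are invented identifiers, and "3 months" and "6 months" are approximated as 92 and 183 days. Event-driven and historical surfaces have no calendar window, so the function never flags them.

```python
from datetime import date, timedelta

# Approximate day counts for the calendar windows in the table above (assumption).
REVIEW_WINDOWS = {
    "principles": timedelta(days=183),        # ~6 months
    "service-handbook": timedelta(days=92),   # ~3 months
    "runbook": timedelta(days=92),            # ~3 months
}


def is_stale(surface: str, last_reviewed: date, today: date) -> bool:
    """True when a page has aged past the review window for its surface."""
    window = REVIEW_WINDOWS.get(surface)
    if window is None:
        # ADRs, delivered bets, and contract-driven surfaces are not reviewed on a calendar.
        return False
    return today - last_reviewed > window
```

Passing `today` explicitly keeps the check deterministic in tests; a CI job would pass the current date.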

## How we apply this [#how-we-apply-this]

* [llms.txt](/llms.txt) and [llms-full.txt](/llms-full.txt) are the machine-readable entry points.
* [Agent-Native Systems](/docs/principles/ai-native/agent-native-systems) defines the broader interface discipline for agent consumers.
* [Keep Docs and Skills in Sync](/docs/guides/keep-docs-and-skills-in-sync) defines the change workflow for canonical docs and skill files.
* [Correct Documentation Drift](/docs/guides/correct-documentation-drift) defines the triage workflow when docs, skills, code, specs, and design records disagree.
* [Decisions](/docs/decisions) records architectural decisions with append-only history.
* [Reference](/docs/reference) contains generated and lookup-oriented material.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Skill files as shadow docs.** A skill that duplicates durable engineering policy becomes stale faster than the canonical docs page.
* **Docs pages as prompts.** Documentation should explain systems and decisions; skills should instruct agents how to act.
* **Documentation as an afterthought.** Docs ship with the feature or the feature is incomplete.
* **Manual reference tables.** If a table can be generated from code, contracts, or schemas, generate it.
* **Unowned pages.** A page without owner and review cadence has no maintenance path.
* **Stale diagrams.** A diagram that does not match the system is worse than no diagram because it creates false confidence.
* **Screenshots as reference.** Screenshots are acceptable as evidence in incidents, not as canonical UI or architecture documentation.
* **Marketing-flavoured engineering docs.** Assertions need evidence, examples, or source-of-truth links.
* **Overstated standards claims.** Distinguish formal standards from emerging conventions. Name the standard, its scope, and why it applies.

## Further reading [#further-reading]

* [Diátaxis](https://diataxis.fr) — the structural model for tutorials, how-to guides, reference, and explanation.
* [llms.txt](https://llmstxt.org) — the emerging convention behind our AI-readable documentation index.
* [Model Context Protocol](https://modelcontextprotocol.io) — the protocol we use for structured agent access to docs resources and tools.
* [ISO/PAS 25955:2026](https://www.iso.org/standard/92127.html) — DDI interoperability specification; useful as a metadata-interoperability reference, not as an agent-discovery standard.
* *Docs for Developers*, Bhatti et al. — practical guidance for engineering documentation.
* *Living Documentation*, Cyrille Martraire — using code and automation to reduce documentation drift.


# Foundations (/docs/principles/foundations)


# Foundations [#foundations]

Foundations are the ideas that shape our engineering before any specific stack, service, or feature enters the conversation. They are deliberately stack-agnostic — the same principles should hold whether we are writing Go, Python, or TypeScript, whether the target is a backend API or a frontend surface, whether the change is large or small.

Four pages live here:

<Cards>
  <Card title="Product Engineering" href="/docs/principles/foundations/product-engineering" description="Outcomes over outputs, shaped work, appetite-based planning, and the relationship between engineering and user impact." />

  <Card title="Code Craft" href="/docs/principles/foundations/code-craft" description="Simplicity, readability, the discipline of deletion, and the refusal to build for hypothetical futures." />

  <Card title="Documentation" href="/docs/principles/foundations/documentation" description="Docs as a product, dual-audience architecture, debt-aware ADRs, and diagrams-as-code." />

  <Card title="Testing" href="/docs/principles/foundations/testing" description="Continuous Risk Assurance — testing the system, not the mock of the system." />
</Cards>

Read these before reading anything else in the principles hub. They are the filter through which every subsequent decision makes sense.


# Product Engineering (/docs/principles/foundations/product-engineering)


# Product Engineering [#product-engineering]

## TL;DR [#tldr]

We are product engineers before we are coders. Our job is to move user outcomes — not to ship tickets. Work is shaped before it is scheduled, scheduled against a fixed appetite rather than an estimate, and measured by the change it makes in user behaviour rather than the volume of code it produces.

## Why this matters [#why-this-matters]

The dominant failure mode of engineering teams in 2026 is not technical debt — it is building the wrong thing well. Feature factories optimise cycle time and output velocity and end up with a product surface that grows faster than the value it delivers. Product engineering is the discipline of resisting that. It says the unit of work is a user outcome, the unit of planning is an appetite, and the test of a PR is whether a real user can feel it.

## Our principles [#our-principles]

### 1. Outcomes over outputs [#1-outcomes-over-outputs]

An "output" is a feature shipped, a ticket closed, a migration completed. An "outcome" is a change in what a user can do, how quickly they can do it, or how reliably the system supports them. We plan around outcomes and let outputs be whatever shape is required to deliver them. A sprint ending with three closed tickets and no user-visible outcome is a sprint of failed work.

### 2. Shape work before scheduling it [#2-shape-work-before-scheduling-it]

No work enters a sprint without having been *shaped*: the problem stated in user terms, the rough solution sketched, the boundaries drawn to exclude rabbit holes. Shaped work is expensive upfront and cheap downstream. Unshaped work is the single biggest source of mid-sprint drift, scope creep, and late-breaking discovery that the whole approach was wrong.

### 3. Appetite, not estimate [#3-appetite-not-estimate]

We set an *appetite* — "this is worth about two weeks of one engineer's attention" — and then design a solution that fits inside it. If it cannot fit, we either reduce scope or reject the work. This inverts the usual flow: instead of estimating the cost of a fixed solution, we fix the cost and negotiate the solution. It forces the team to ask "what is the cheapest version of this that delivers the outcome?" and it kills the tendency of work to expand to the time available.

### 4. Kill your darlings [#4-kill-your-darlings]

If a feature is not moving an outcome, we remove it. Deletion is the most under-used tool in a product engineer's kit. Every line of code, every page of docs, every dashboard tile, every CLI flag that does not pay for its maintenance cost should be cut. A smaller, sharper product is cheaper to operate and easier for the next engineer to understand.

### 5. Instrument everything you ship [#5-instrument-everything-you-ship]

A feature that is not measured does not exist from a product engineering point of view. We decide the signal *before* we ship — event, dashboard, success criterion — and we check the signal after release. If we cannot measure it, we negotiate the feature until we can.

## How we apply this [#how-we-apply-this]

* [Run Tests](/docs/guides/run-tests) — we test the outcome, not the implementation.
* [Progressive Delivery](/docs/principles/delivery/progressive-delivery) — canaries and flags are the mechanism by which we measure outcomes safely.
* [Observability](/docs/principles/quality/observability) — the signal layer that makes outcome-based engineering possible.
* [Decisions](/docs/decisions) — the record of shaping decisions that cost us real time.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Velocity-as-KPI.** Story points per sprint measure nothing about user outcomes. Optimising for them corrupts the team.
* **Estimate-driven planning.** Estimates anchor on how long the team thinks work will take, not on how much the work is worth. We use appetites instead.
* **"Build it and they will come."** Launching a feature without a measurement plan is a signal that no one owns the outcome.
* **Technical-debt-for-its-own-sake projects.** Refactors without a user-visible payoff are a smell; wrap them inside an outcome that demands them.

## Further reading [#further-reading]

* *Shape Up*, Ryan Singer — the canonical treatment of shaped work and fixed appetites.
* *Inspired*, Marty Cagan — the product-engineering triad and its implications for how teams are built.
* *Escaping the Build Trap*, Melissa Perri — why feature-factory metrics corrupt outcomes.


# Testing (/docs/principles/foundations/testing)


# Testing [#testing]

## TL;DR [#tldr]

Tests are risk-weighted assertions about production behaviour — not boxes ticked for coverage. We favour high-fidelity service tests over solitary unit tests, emulate dependencies rather than mocking them, and treat observability signals as first-class test assertions.

## Why this matters [#why-this-matters]

The dominant failure mode of a test suite in 2026 is not that it is too small — it is that it passes while production breaks. Mocked dependencies drift from their real counterparts, unit tests assert on implementation rather than behaviour, and green CI gives a false sense of security. *Continuous Risk Assurance* is our name for the discipline that replaces "coverage as a target" with "risk as the thing we actually measure."

## Our principles [#our-principles]

### 1. Favour service tests over solitary unit tests [#1-favour-service-tests-over-solitary-unit-tests]

The "sociable" service test is our foundational unit of validation. We test from the API entry point through to real, ephemeral database containers. We reserve solitary unit tests exclusively for complex isolated algorithms (parsers, validators, pure computation). In a service-oriented codebase, the interesting bugs live at the boundaries — HTTP serialisation, SQL query correctness, event emission — and those are exactly what solitary unit tests mock away.

### 2. Emulate, don't mock [#2-emulate-dont-mock]

If a dependency can run in a container — Postgres, Pub/Sub, object storage — we emulate it via Testcontainers or equivalent. In-memory fakes miss critical data-integrity, serialisation, and networking issues. The startup cost is strictly worth the confidence gain; these are precisely the bugs that escape to production when you mock them out. Emulators are reset per test suite to maintain determinism and prevent test pollution.
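
The gap between a hand-written fake and a real engine can be shown in a few lines. In this sketch SQLite stands in for a containerised Postgres purely so the example runs without Docker; that substitution is an assumption of the example, and the store classes are hypothetical. The fake happily accepts a duplicate row, while the real engine enforces the `UNIQUE` constraint.

```python
import sqlite3


class FakeUserStore:
    """Hand-written in-memory fake: nobody modelled the UNIQUE constraint."""

    def __init__(self):
        self.rows = []

    def add(self, email: str) -> None:
        self.rows.append(email)


class SqlUserStore:
    """Same interface, backed by a real SQL engine that enforces the schema."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

    def add(self, email: str) -> None:
        self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))


def accepts_duplicate(store) -> bool:
    """Insert the same email twice and report whether the store let it through."""
    store.add("a@example.com")
    try:
        store.add("a@example.com")
    except sqlite3.IntegrityError:
        return False
    return True
```

A test suite built on the fake would pass while production rejects the second insert. Running against an ephemeral container of the actual database closes that gap for dialect-specific behaviour as well.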

### 3. Observability is a test surface [#3-observability-is-a-test-surface]

OpenTelemetry instrumentation is a design-time concern, not an afterthought. System tests assert that traces are unbroken end-to-end: a missing span, a lost TraceID, or a broken parent-child relationship is a test failure, not an instrumentation TODO. The boundary between "test" and "monitor" dissolves — both are asking whether the system is behaving as we claim.
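
A trace-continuity assertion can be sketched without any SDK. The `Span` type and `trace_problems` function below are hypothetical; in a real system test the spans would come from whatever exporter the test harness captures, which is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Minimal span record: just enough fields to check continuity."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str


def trace_problems(spans: list) -> list:
    """Return human-readable continuity problems; an empty list means the trace is unbroken."""
    problems = []
    if len({s.trace_id for s in spans}) > 1:
        problems.append("spans carry more than one trace id")
    roots = [s for s in spans if s.parent_id is None]
    if len(roots) != 1:
        problems.append(f"expected exactly one root span, found {len(roots)}")
    known = {s.span_id for s in spans}
    for s in spans:
        if s.parent_id is not None and s.parent_id not in known:
            problems.append(f"span {s.name!r} points at a missing parent")
    return problems
```

A system test would collect the spans emitted by one request and assert that `trace_problems(spans) == []`, so a dropped TraceID fails the build instead of surfacing during an incident.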

### 4. Name tests by behaviour, not implementation [#4-name-tests-by-behaviour-not-implementation]

Every test follows a BDD-style name: `[Function] should [expected outcome] when [condition]`. This ensures the test log alone tells the story: an on-call engineer reading a failure can form a hypothesis without opening the test code. Names like `TestCreateLoop_Success` are banned — they convey nothing beyond what already appears on the dashboard.
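
The naming rule is mechanical enough to lint. The sketch below assumes test names are available as plain strings (for example, subtest or test-case descriptions); the pattern and function name are illustrative, not an existing check.

```python
import re

# "<subject> should <outcome> when <condition>", with each part non-empty.
BEHAVIOUR_NAME = re.compile(r"^\S.* should \S.* when \S.*$")


def is_behaviour_name(test_name: str) -> bool:
    """True when the name states a subject, an expected outcome, and a condition."""
    return BEHAVIOUR_NAME.match(test_name) is not None
```

For example, `"CreateLoop should return 409 when the title already exists"` passes, while `"TestCreateLoop_Success"` and `"CreateLoop should succeed"` do not, because neither states the condition under which the outcome is expected.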

### 5. Risk-based depth, not blanket coverage [#5-risk-based-depth-not-blanket-coverage]

Coverage percentages are meaningless without proof that the assertions catch real faults. We score modules using a risk matrix — Impact × Complexity × Change-frequency — before deciding on test depth. High-risk modules earn live system tests and chaos experiments; low-risk modules need only small tests and static analysis. Equal test depth everywhere is wasted effort.

### 6. Tests are part of the change, not after it [#6-tests-are-part-of-the-change-not-after-it]

A PR without tests is incomplete. A test added in a follow-up PR is a test that will never be written. We write tests alongside the code they verify, and we review the test with the same rigour as the code. If a change resists testing, that is a signal about the design of the code, not the design of the test.

## How we apply this [#how-we-apply-this]

* [Run Tests](/docs/guides/run-tests) — how to invoke the suites locally and in CI.
* [Observability](/docs/principles/quality/observability) — the OTel-first stance that makes traces-as-assertions possible.
* [Reliability](/docs/principles/quality/reliability) — how tests compose with chaos and load experiments.
* [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) — the structural choice that makes tests cheap to write and fast to run.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Mocking the database.** A test that mocks the database asserts against your assumptions about how the database behaves, not against the database itself. Use an ephemeral container.
* **Snapshot tests as a default.** Snapshots are a brittle, noisy substitute for behavioural assertions. They are acceptable only when the thing being snapshotted is a genuinely opaque artefact (a rendered email, a serialised response).
* **Coverage-gated CI.** "95% line coverage required" is a metric that can be gamed without improving real risk reduction. Use it as a read-out, never as a gate.
* **Shared staging environments as the integration test.** Staging has no hermetic guarantees, no reproducibility, and no determinism. It is a deployment target; it is not a test bed.
* **"It's hard to test, so we didn't."** That is a signal the code is badly designed. Fix the code.

## Further reading [#further-reading]

* *Accelerate*, Forsgren, Humble, Kim — the empirical case for continuous delivery and its testing discipline.
* *Working Effectively with Legacy Code*, Michael Feathers — seams, test doubles, and when each is appropriate.
* *Growing Object-Oriented Software, Guided by Tests*, Freeman & Pryce — the canonical treatment of outside-in service testing.
* *xUnit Test Patterns*, Gerard Meszaros — the vocabulary we use for test doubles, fixtures, and strategies.


# Accessibility (/docs/principles/quality/accessibility)


# Accessibility [#accessibility]

## TL;DR [#tldr]

Every user interface we ship meets WCAG 2.2 AA as a baseline. Keyboard, screen reader, and visual assistive technology are first-class targets, not after-launch polish. A feature that does not work for a keyboard user or a screen-reader user is not finished.

## Why this matters [#why-this-matters]

Accessibility is not a niche concern — a significant fraction of our users rely on assistive technology at some point. Beyond the moral case (equal access is a baseline), the design constraints that accessibility imposes — clear hierarchy, visible focus, semantic structure, predictable navigation — tend to produce better software for *every* user. An accessible interface is almost always also a clearer, calmer interface.

## Our principles [#our-principles]

### 1. WCAG 2.2 AA is the floor, not the ceiling [#1-wcag-22-aa-is-the-floor-not-the-ceiling]

We conform to WCAG 2.2 AA for every page, every component, every release. AA is the baseline, and we aim for AAA on critical journeys where the cost is bearable. Falling below AA is a bug; it is not a trade-off we make.

### 2. Keyboard first [#2-keyboard-first]

Every interactive element is reachable and usable with the keyboard. Tab order is logical, focus is always visible, and there are no keyboard traps. The design test is simple: can a power user — or a user who cannot use a mouse — complete every journey without touching the pointer?

### 3. Screen readers see what sighted users see [#3-screen-readers-see-what-sighted-users-see]

Semantic HTML first; ARIA only when HTML is not expressive enough. Headings form an outline, landmarks mark regions, form fields carry labels, images carry alt text, live regions announce updates. A screen reader should produce a narrative that matches what a sighted user sees — not a richer or poorer version of it.

### 4. Colour is never the only signal [#4-colour-is-never-the-only-signal]

A red error, a green success, a blue link — each one is accompanied by a label, an icon, or a structural cue. Colour-blind users exist; colour-only signalling is an exclusion.

### 5. Motion is optional [#5-motion-is-optional]

Animations respect `prefers-reduced-motion`. Large-scale parallax and aggressive transitions are used sparingly; for users with vestibular conditions, unrequested motion is not decoration, it is an accessibility failure.

### 6. Live regions are used sparingly and correctly [#6-live-regions-are-used-sparingly-and-correctly]

Real-time updates (transcription chunks appearing, participants joining) are announced via `aria-live` when they matter to the user's understanding. But over-announcement is as bad as under-announcement: noisy announcements teach screen-reader users to tune out the ones that matter.

### 7. Testing is multi-layered [#7-testing-is-multi-layered]

We run automated accessibility checks in CI (axe, Lighthouse accessibility audits), keyboard-walk every new journey manually, and run screen-reader walkthroughs on major features. Automated testing catches the common failures; humans catch the semantic ones.
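
To give a sense of what the automated layer catches, here is a deliberately tiny lint over static HTML using only the Python standard library. It is a sketch of the kind of rule such tools apply, not a substitute for axe or Lighthouse, and the class and function names are hypothetical. It flags two common failures: an `<img>` with no `alt` attribute and a `<div>` carrying an `onclick` handler.

```python
from html.parser import HTMLParser


class A11yLint(HTMLParser):
    """Collects a finding for each element that matches a known-bad pattern."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        # alt="" is valid for decorative images; a missing attribute is not.
        if tag == "img" and "alt" not in names:
            self.findings.append("img without alt attribute")
        if tag == "div" and "onclick" in names:
            self.findings.append("div used as a button")


def lint(html: str) -> list:
    """Return findings in document order; an empty list means no rule fired."""
    parser = A11yLint()
    parser.feed(html)
    parser.close()
    return parser.findings
```

Checks like this are cheap and catch the common cases. Whether the `alt` text is meaningful, or the focus order makes sense, still needs the manual keyboard and screen-reader passes.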

### 8. Accessibility is reviewed like code [#8-accessibility-is-reviewed-like-code]

Accessibility issues are tracked, owned, and closed the same way any other bug is. The backlog does not accumulate "we will get to the a11y later" — that queue grows forever. Every PR author is expected to include the accessibility check in their definition-of-done.

## How we apply this [#how-we-apply-this]

* [Frontend](/docs/principles/stack/frontend) — the component-library patterns that make accessibility default.
* [App Service Handbook](/docs/learn/services/app) — the wordloop-app architectural view.
* [DevEx](/docs/principles/delivery/devex) — the CI gates that block accessibility regressions.
* [Performance](/docs/principles/quality/performance) — related budgets that compound with accessibility.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Placeholder text as label.** The placeholder disappears when the field is filled; the label is gone. Users who come back to check the field see nothing. Use a visible label.
* **`<div>` as button.** A `div` with an `onClick` is invisible to keyboard, screen reader, and user agent. Use `<button>`.
* **Tiny click targets.** A button smaller than \~44px square is difficult on touch devices and punishing for users with motor impairments.
* **Focus-removal for aesthetics.** `outline: none` without a replacement focus style breaks keyboard navigation entirely.
* **"We will add a11y in v2."** v2 will not have it either. Build it in.
* **Modals without focus management.** Trap focus inside the modal; restore focus when it closes. Otherwise keyboard users are lost.

## Further reading [#further-reading]

* *WCAG 2.2* ([w3.org/WAI/WCAG22](https://www.w3.org/WAI/WCAG22)) — the normative standard.
* *Inclusive Components*, Heydon Pickering — the canonical pattern language for accessible UI components.
* *ARIA Authoring Practices Guide* ([w3.org/WAI/ARIA/apg](https://www.w3.org/WAI/ARIA/apg)) — the reference for every ARIA pattern.
* *Accessibility for Everyone*, Laura Kalbag — the short introduction for engineers who need to learn the landscape quickly.


# Quality (/docs/principles/quality)


# Quality [#quality]

Quality is not a phase and it is not a team. It is a set of commitments that every engineer makes every day. The pages in this section articulate those commitments for the six dimensions that compound most heavily when ignored: reliability, observability, performance, security, privacy, and accessibility.

<Cards>
  <Card title="Reliability" href="/docs/principles/quality/reliability" description="SRE fundamentals, graceful degradation, circuit breakers, and how we design systems that stay up under load and failure." />

  <Card title="Observability" href="/docs/principles/quality/observability" description="OpenTelemetry-first design, SLOs, error budgets, and trace-driven development." />

  <Card title="Performance" href="/docs/principles/quality/performance" description="Latency budgets, tail latency, backpressure, and load shedding." />

  <Card title="Security" href="/docs/principles/quality/security" description="Zero-trust, threat modeling, SLSA supply-chain integrity, and the secure SDLC." />

  <Card title="Privacy" href="/docs/principles/quality/privacy" description="Data minimisation, GDPR, PII handling, and data residency for a platform that processes meeting audio." />

  <Card title="Accessibility" href="/docs/principles/quality/accessibility" description="WCAG 2.2 AA, keyboard-first design, screen-reader flows, and inclusive UX as a baseline, not a stretch goal." />
</Cards>


# Observability (/docs/principles/quality/observability)


# Observability [#observability]

## TL;DR [#tldr]

Observability is a design property, not a monitoring bolt-on. We instrument every service with OpenTelemetry from day one, build dashboards from the instrumentation, and use traces as both a debugging tool and a first-class test assertion. If a system is behaving strangely and we cannot see why in our data, the instrumentation — not the guessing — is what we fix.

## Why this matters [#why-this-matters]

The difference between a team that can ship with confidence and one that cannot is, most of the time, a difference in what they can see. Observability gives a team three things: the ability to know whether the system is healthy, the ability to localise a fault when it is not, and the ability to explain what happened after the fact. Without those, every deploy is a gamble and every incident is a fresh investigation. With them, the team moves faster and sleeps better.

## Our principles [#our-principles]

### 1. OpenTelemetry is the common language [#1-opentelemetry-is-the-common-language]

Every service emits traces, metrics, and logs through OpenTelemetry SDKs to a single collector. Vendor lock-in is confined to the collector boundary and kept out of application code. Switching backends is a collector configuration change, not an application rewrite.

### 2. Traces are the primary signal [#2-traces-are-the-primary-signal]

Given a choice between adding a metric or enriching a trace, we enrich the trace. Traces preserve causality; metrics aggregate it away. For a system where one user action traverses half a dozen services, causality is the difference between a diagnosable incident and a guessing game.

### 3. The "three pillars" are one pillar [#3-the-three-pillars-are-one-pillar]

Logs, metrics, and traces are not independent data — they are different projections of the same events. A log line includes its trace ID; a metric includes the dimensions that let you pivot back to traces; an exemplar on a metric points directly at the trace that produced it. If a team has three disconnected telemetry systems, it has no observability.

### 4. Dashboards derive from SLOs [#4-dashboards-derive-from-slos]

Every dashboard starts with the user-journey SLO it supports ([Reliability](/docs/principles/quality/reliability)). Latency percentiles, error rates, saturation, and traffic (the "RED/USE" layers) then fill in the detail. Dashboards assembled by adding "interesting-looking" graphs drift into uselessness; dashboards derived from SLOs stay useful.

### 5. Trace-driven development [#5-trace-driven-development]

When building a new feature, we sketch the trace it should produce *before* we write the handler. What spans must exist? What attributes must each span carry? What parent-child relationships are required? The instrumentation design shapes the code, not the other way around. This makes it essentially impossible to ship a feature that is unobservable.

### 6. Assert on telemetry in tests [#6-assert-on-telemetry-in-tests]

System tests assert that traces are unbroken end-to-end: a missing span is a test failure ([Testing](/docs/principles/foundations/testing)). This keeps the instrumentation from drifting: it is part of the contract, not an optional decoration, and any regression is caught before merge.
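
One way to picture the "unbroken trace" assertion is as a check over the spans a test run exported. The sketch below is a minimal illustration under assumptions: `Span`, `find_broken_spans`, and `assert_trace_unbroken` are hypothetical names, not OpenTelemetry SDK types, and a real system test would read spans from an in-memory exporter or from the collector.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record (not the OpenTelemetry SDK type)."""
    span_id: str
    parent_id: Optional[str]
    name: str


def find_broken_spans(spans: list[Span]) -> list[str]:
    """Return names of spans whose parent is missing from the trace.

    Root spans (parent_id is None) are never considered broken.
    """
    known_ids = {s.span_id for s in spans}
    return [
        s.name
        for s in spans
        if s.parent_id is not None and s.parent_id not in known_ids
    ]


def assert_trace_unbroken(spans: list[Span], required: set[str]) -> None:
    """Fail if a required span is absent, a span is orphaned, or the trace
    does not have exactly one root."""
    names = {s.name for s in spans}
    missing = required - names
    assert not missing, f"missing spans: {sorted(missing)}"
    orphans = find_broken_spans(spans)
    assert not orphans, f"orphaned spans: {orphans}"
    roots = [s for s in spans if s.parent_id is None]
    assert len(roots) == 1, f"expected exactly one root span, got {len(roots)}"
```

The `required` set is where the trace sketched during design becomes an executable contract: removing a span from the handler later makes the test fail.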

### 7. Logs are structured, sampled, and contextual [#7-logs-are-structured-sampled-and-contextual]

Every log line is structured (JSON), carries its trace ID, and is emitted at a severity that the team has actually agreed on. We sample aggressively at debug and info — nobody needs every log line in production — and we do not sample errors. Unstructured log lines are not logs; they are a different kind of noise.
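
As a rough illustration of "structured, sampled, and contextual", the following sketch uses only the Python standard library. The class names, the `trace_id` field name, and the idea of passing the trace ID through `extra=` are assumptions for the example, not a description of the shared logging library.

```python
import json
import logging
import random
from typing import Optional


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that carries its trace ID."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "severity": record.levelname,
            "message": record.getMessage(),
            "logger": record.name,
            # Assumed convention: callers pass extra={"trace_id": ...}.
            "trace_id": getattr(record, "trace_id", None),
        })


class SeveritySampler(logging.Filter):
    """Keep a fraction of DEBUG/INFO records; keep everything above INFO."""

    def __init__(self, keep_ratio: float,
                 rng: Optional[random.Random] = None) -> None:
        super().__init__()
        self.keep_ratio = keep_ratio
        self.rng = rng or random.Random()

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno > logging.INFO:
            return True  # warnings and errors are never sampled away
        return self.rng.random() < self.keep_ratio


def build_handler(keep_ratio: float) -> logging.Handler:
    """Wire the formatter and sampler onto a stream handler."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    handler.addFilter(SeveritySampler(keep_ratio))
    return handler
```

A call such as `logger.info("turn stored", extra={"trace_id": trace_id})` would then produce a JSON line that can be joined back to its trace.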

### 8. Cardinality is a design choice [#8-cardinality-is-a-design-choice]

High-cardinality attributes (per-user, per-tenant, per-meeting) are valuable for debugging but expensive in storage. We tag deliberately — high cardinality on traces where it is queryable, lower cardinality on metrics where it multiplies by every time window. Runaway cardinality is one of the most expensive mistakes a team can make in observability; it is a design call, not a default.
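
The cost argument is multiplicative, which a few lines of arithmetic make concrete. The label names and counts below are illustrative assumptions, not measurements from our metrics.

```python
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric: the product of the number
    of distinct values each label can take."""
    return prod(label_cardinalities.values())


# Illustrative numbers only.
bounded = {"route": 40, "method": 5, "status_class": 5}
with_meeting_id = {**bounded, "meeting_id": 100_000}

print(series_count(bounded))          # 1000
print(series_count(with_meeting_id))  # 100000000
```

Adding one per-meeting label turns a thousand series into a hundred million, which is why the same attribute belongs on a span rather than on a metric.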

## How we apply this [#how-we-apply-this]

* [Architecture / Observability](/docs/learn/architecture/observability) — the specific OTel collector topology.
* [Reliability](/docs/principles/quality/reliability) — the SLO layer built on top of this telemetry.
* [Testing](/docs/principles/foundations/testing) — how we assert on traces in system tests.
* [Performance](/docs/principles/quality/performance) — the latency work that depends on good tracing.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Pillar-at-a-time adoption.** "We'll add metrics now, traces later." You will not.
* **Vendor SDKs in application code.** Application code imports OpenTelemetry; the collector talks to the vendor.
* **Dashboards without SLOs.** Pretty charts without a question they are answering.
* **Logs-as-debugger.** Using `printf` style logging to trace a single bug. Write a test; add a span.
* **Print-statement-style `Debug` in production.** If every deploy adds ten debug logs and the next removes twelve, we are missing structure.
* **Cardinality explosions.** Putting a UUID in a Prometheus label. The bill and the query planner will both remember.

## Further reading [#further-reading]

* *Observability Engineering*, Majors, Fong-Jones, Miranda — the canonical text on traces-first observability.
* *Distributed Systems Observability*, Cindy Sridharan — the short, sharp introduction.
* *The OpenTelemetry specification* ([opentelemetry.io/docs/specs](https://opentelemetry.io/docs/specs)) — worth reading the high-level overview at least once.
* *Systems Performance*, Brendan Gregg — the canonical reference for the "USE method" (utilisation, saturation, errors).


# Performance (/docs/principles/quality/performance)


# Performance [#performance]

## TL;DR [#tldr]

Performance is not "fast enough" — it is a budget, spent deliberately across every hop of a user interaction and enforced in CI. We optimise for tail latency, we design backpressure into real-time flows, and we measure the things users feel, not the things developers find convenient.

## Why this matters [#why-this-matters]

Users notice latency before they notice almost anything else. A transcription that renders in 800ms feels instant; at 3000ms it feels broken. The difference is not a factor of four in effort — it is a difference of whether the team thought about latency as a design constraint or as a post-hoc tuning problem. Performance handled as an afterthought is invariably more expensive than performance designed in from the start.

## Our principles [#our-principles]

### 1. Latency is a budget, allocated top-down [#1-latency-is-a-budget-allocated-top-down]

Every user-facing operation starts with a latency budget at the edge — say, 500ms — and that budget is allocated to downstream hops. If the recap fetch has 300ms and the transcript join has 150ms, the handler has 50ms of its own work. When a hop overruns its allocation, somebody else's budget gets squeezed. The budgeting view makes trade-offs explicit.
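
The arithmetic in the example above can be written down as a small check. This is a sketch only; the hop names and the helper functions are hypothetical.

```python
def remaining_budget_ms(edge_budget_ms: int, allocations: dict[str, int]) -> int:
    """Budget left for the handler's own work after downstream hops are allocated.

    Raises ValueError when the hops alone exceed the edge budget.
    """
    spent = sum(allocations.values())
    if spent > edge_budget_ms:
        raise ValueError(
            f"hops allocate {spent}ms, over the {edge_budget_ms}ms edge budget"
        )
    return edge_budget_ms - spent


def overruns(allocations: dict[str, int], observed_ms: dict[str, int]) -> dict[str, int]:
    """Per-hop overrun in ms for every hop whose observed latency exceeds its allocation."""
    return {
        hop: observed_ms[hop] - allowed
        for hop, allowed in allocations.items()
        if observed_ms.get(hop, 0) > allowed
    }


allocations = {"recap_fetch": 300, "transcript_join": 150}
print(remaining_budget_ms(500, allocations))  # 50
print(overruns(allocations, {"recap_fetch": 340, "transcript_join": 120}))
# {'recap_fetch': 40}
```

Here a 40ms overrun in the recap fetch consumes most of the 50ms the handler had for its own work, which is the squeeze the budgeting view is meant to expose.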

### 2. Measure tail latency, not average [#2-measure-tail-latency-not-average]

p50 is a marketing number. p95 and p99 are what users experience. We measure and alert on the tail; we design for the tail. A system with a great median and a terrible p99 will have an awful reputation, no matter what the dashboard says.
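
A small synthetic example shows how a median hides the tail. The data is made up for illustration, and the nearest-rank method used here is one of several common percentile definitions; a metrics backend may interpolate differently.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# 100 synthetic requests: 94 fast, 6 slow (illustrative, not a measurement).
latencies_ms = [80.0] * 94 + [2500.0] * 6
mean_ms = sum(latencies_ms) / len(latencies_ms)

print(percentile(latencies_ms, 50))  # 80.0
print(percentile(latencies_ms, 95))  # 2500.0
print(percentile(latencies_ms, 99))  # 2500.0
```

The median reports 80ms and the mean lands at roughly 225ms, yet six requests in every hundred take 2.5 seconds. Only the p95 and p99 lines show what those users saw.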

### 3. Pre-compute, cache, and denormalise deliberately [#3-pre-compute-cache-and-denormalise-deliberately]

When a read is hot, we pre-compute. When a computation is stable, we cache. When a join is expensive, we denormalise. Each of these trades complexity for latency; each of them earns its keep with data, not with intuition. Speculative caching is how cache-invalidation bugs become the biggest source of data incidents.

### 4. Backpressure is designed in, not hoped for [#4-backpressure-is-designed-in-not-hoped-for]

Every producer has a bounded queue and a defined behaviour when the queue fills: shed, coalesce, block ([Real-Time](/docs/principles/system-design/real-time)). "It works fine in load tests" is not a backpressure strategy.
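
A minimal sketch of a bounded buffer with an explicit full-queue behaviour follows. `BoundedBuffer` and `Overflow` are hypothetical names for illustration. Only the shed and coalesce policies are implemented; the blocking case is what `queue.Queue(maxsize=n).put()` in the standard library already does.

```python
import threading
from collections import deque
from enum import Enum
from typing import Any, Callable, Optional


class Overflow(Enum):
    SHED = "shed"          # drop the new item when the buffer is full
    COALESCE = "coalesce"  # merge the new item into the newest queued item


class BoundedBuffer:
    """A bounded FIFO whose behaviour when full is chosen up front."""

    def __init__(self, maxsize: int, overflow: Overflow,
                 merge: Optional[Callable[[Any, Any], Any]] = None) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        if overflow is Overflow.COALESCE and merge is None:
            raise ValueError("COALESCE needs a merge function")
        self._items: deque = deque()
        self._maxsize = maxsize
        self._overflow = overflow
        self._merge = merge
        self._lock = threading.Lock()
        self.shed_count = 0  # exported as a metric in a real system

    def offer(self, item: Any) -> bool:
        """Enqueue `item`; return False only if it was shed."""
        with self._lock:
            if len(self._items) < self._maxsize:
                self._items.append(item)
                return True
            if self._overflow is Overflow.COALESCE:
                self._items[-1] = self._merge(self._items[-1], item)
                return True
            self.shed_count += 1
            return False

    def poll(self) -> Optional[Any]:
        """Dequeue the oldest item, or None when the buffer is empty."""
        with self._lock:
            return self._items.popleft() if self._items else None
```

Whichever policy is chosen, the point is that the choice is visible in the constructor, not discovered under load.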

### 5. Load shedding protects the system from itself [#5-load-shedding-protects-the-system-from-itself]

When the system is saturated, the right behaviour is not to try harder — it is to serve fewer requests well. We shed on clearly-defined criteria: low-priority traffic first, new sessions before active ones, non-interactive before interactive. Shedding is a designed degradation mode, not an accident.
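
The ordering described above can be expressed as a small decision function. This is a sketch under assumptions: the `Request` fields are hypothetical, and the utilisation thresholds are illustrative values, not tuned ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    interactive: bool
    session_active: bool      # belongs to a session already in progress
    low_priority: bool = False


def shed_rank(req: Request) -> int:
    """Lower rank is shed first: low-priority traffic, then new sessions,
    then non-interactive work. Rank 3 is never shed by this policy."""
    if req.low_priority:
        return 0
    if not req.session_active:
        return 1
    if not req.interactive:
        return 2
    return 3


def should_shed(req: Request, utilisation: float) -> bool:
    """Shed progressively more classes as utilisation rises."""
    thresholds = {0: 0.80, 1: 0.90, 2: 0.95}  # illustrative assumptions
    limit = thresholds.get(shed_rank(req))
    return limit is not None and utilisation >= limit
```

Active interactive requests have no threshold at all in this sketch, so the system keeps serving them while everything else is turned away.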

### 6. Hot paths have no allocations to spare [#6-hot-paths-have-no-allocations-to-spare]

For the hottest inner loops — real-time audio processing, per-turn ingestion — we write allocation-aware code. Every allocation is a GC pause in waiting, and at high rate the pauses become the latency. Most code does not need this discipline; the hot paths demand it.

### 7. Profile before you optimise [#7-profile-before-you-optimise]

Every non-trivial optimisation starts with a profile. The "obvious" bottleneck is almost always wrong, and effort spent tuning a cold path is effort wasted. We profile in production-representative conditions; profiles from developer laptops lie.

### 8. Budgets are enforced in CI [#8-budgets-are-enforced-in-ci]

Bundle sizes, Lighthouse scores, and worst-case handler latencies are measured in CI against committed thresholds. A PR that regresses a budget requires an explicit, reviewed waiver. Performance regressions that slip in once slip in a hundred times; automation is cheaper than vigilance.
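
A CI gate of this kind reduces to comparing measurements against committed thresholds, with waivers as explicit input. The sketch below assumes hypothetical metric names and treats every metric as "lower is better"; a score where higher is better would need the comparison inverted.

```python
def check_budgets(measured: dict[str, float], thresholds: dict[str, float],
                  waived: frozenset[str] = frozenset()) -> list[str]:
    """Return one failure message per budget that is exceeded or unmeasured.

    Budgets listed in `waived` are skipped; the waiver itself is expected to
    live in a reviewed file, which this sketch does not model.
    """
    failures = []
    for name, limit in sorted(thresholds.items()):
        if name in waived:
            continue
        value = measured.get(name)
        if value is None:
            failures.append(f"{name}: no measurement")
        elif value > limit:
            failures.append(f"{name}: {value} exceeds budget {limit}")
    return failures


thresholds = {"bundle_kb": 250, "handler_p99_ms": 300}
measured = {"bundle_kb": 262, "handler_p99_ms": 280}
print(check_budgets(measured, thresholds))
# ['bundle_kb: 262 exceeds budget 250']
```

A missing measurement counts as a failure so that a broken benchmark step cannot silently pass the gate.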

## How we apply this [#how-we-apply-this]

* [Observability](/docs/principles/quality/observability) — the measurement surface for latency work.
* [Reliability](/docs/principles/quality/reliability) — the SLO discipline that makes performance budgets enforceable.
* [Frontend](/docs/principles/stack/frontend) — the client-side performance budgets.
* [Real-Time](/docs/principles/system-design/real-time) — the streaming-specific patterns we apply.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Optimising on hunch.** No profile, no optimisation.
* **"It is fast on my laptop."** Dev latency is not production latency. Measure in the environment that matters.
* **Average-as-metric.** p50 is a lie. Use percentiles.
* **Unbounded queues.** A queue without a max is a latency bomb.
* **Cache invalidation left to the reader.** If the cache can serve stale data under a defined circumstance, that circumstance is documented. Otherwise it is a bug.
* **"We will fix performance later."** If you ship slow, users will remember slow.

## Further reading [#further-reading]

* *Systems Performance*, Brendan Gregg — the canonical reference; read the USE and RED chapters first.
* *High Performance Browser Networking*, Ilya Grigorik — the frontend-and-network half of the story.
* *Latency Numbers Every Programmer Should Know* (Jeff Dean) — calibrate your intuition.
* Gil Tene, "How NOT to Measure Latency" — the talk on coordinated omission and why naive latency measurements lie.


# Privacy (/docs/principles/quality/privacy)


# Privacy [#privacy]

## TL;DR [#tldr]

We handle private audio from users who trusted us with it. Our privacy stance is that we only collect what we need, keep it only as long as we need it, expose it only where it is needed, and let users see, correct, and remove their own data on demand. Privacy is a design input, not a compliance appendage.

## Why this matters [#why-this-matters]

A privacy failure at a company like ours is not a regulatory inconvenience — it is a direct breach of the most sensitive interaction a user has with our product. A recording of someone's meeting, a transcript of a difficult conversation, a recap that includes names and numbers — none of this has the same forgiveness curve as a leaked login. Privacy has to be thought about at design time, because once the data exists in the wrong shape or the wrong place, remediation is punishingly expensive.

## Our principles [#our-principles]

### 1. Collect the minimum [#1-collect-the-minimum]

For every field we capture, we ask: do we actually need this to deliver the user's outcome? Email for authentication, audio for transcription, participation records for collaboration — yes. Browser fingerprint for "analytics" — almost never. Data minimisation reduces both privacy risk and operational complexity.

### 2. Retain for a bounded time [#2-retain-for-a-bounded-time]

Every category of data has an explicit retention policy set at collection time. Audio is transcribed and then deleted unless the user has opted into retention; Transcriptions follow the Meeting's retention policy; derived embeddings inherit the shortest retention among their sources. "We keep it forever" is never the answer; expired data is deleted by automation, not by a Tuesday-afternoon cron.
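
The retention rules above can be sketched as data plus two small functions. The `Record` type and the durations are illustrative assumptions; the actual policies live with the data contracts.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Record:
    category: str
    collected_at: datetime
    retention: timedelta  # fixed when the record is collected


def expires_at(record: Record) -> datetime:
    """The moment after which the record must no longer exist."""
    return record.collected_at + record.retention


def derived_retention(sources: list[Record]) -> timedelta:
    """Derived data takes the shortest retention among its sources."""
    return min(s.retention for s in sources)


def expired(records: list[Record], now: datetime) -> list[Record]:
    """Records an automated sweep should delete at time `now`."""
    return [r for r in records if expires_at(r) <= now]


# Illustrative values only.
t0 = datetime(2026, 1, 1, tzinfo=timezone.utc)
audio = Record("audio", t0, timedelta(days=1))
transcript = Record("transcription", t0, timedelta(days=90))
```

With these example values, an embedding derived from both records would inherit the one-day retention of the audio, so it cannot outlive the most sensitive input it was computed from.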

### 3. Access is scoped and audited [#3-access-is-scoped-and-audited]

Every internal access to user data is authenticated, authorised, and logged. Engineers cannot browse production data casually; support staff cannot read a transcript without a clear business reason and an auditable access record. Unsupervised access is a policy failure waiting to be discovered.

### 4. Users see, control, and remove their data [#4-users-see-control-and-remove-their-data]

Data subject rights — access, rectification, portability, deletion — are first-class features, not regulatory bolt-ons. A user's deletion request flows through the same plumbing as retention expiry: structured, automated, and verifiable. A deletion that leaves "just this one copy" around is a promise broken.

### 5. Design for data residency [#5-design-for-data-residency]

Where data lives matters — both for regulation (EU user data must stay on EU infrastructure for some purposes) and for user expectation. Residency is a design input to storage and pipeline choices, not an afterthought discovered during procurement.

### 6. PII is handled distinctly from content [#6-pii-is-handled-distinctly-from-content]

PII (email addresses, names, IPs) has a shorter retention and tighter access controls, and is explicitly not co-located with content where we can help it. Treating all data the same turns the problems of the most sensitive fields into the problems of every field.

### 7. Model training respects user choice [#7-model-training-respects-user-choice]

User data is used to train or evaluate models only when the user has given informed consent, and the consent record is auditable. Assuming consent because "everyone does" is not a posture we hold.

### 8. Privacy reviews happen before launch [#8-privacy-reviews-happen-before-launch]

Every feature that touches user data has a privacy review before it ships — the same rhythm as a security review, often in the same meeting. The reviewer asks the specific questions a regulator or an investigative journalist would, and the answers go on the record. "We will do the privacy review after launch" is a commitment that never gets honoured.

## How we apply this [#how-we-apply-this]

* [Data Engineering](/docs/principles/system-design/data-engineering) — retention and contract discipline.
* [Security](/docs/principles/quality/security) — the perimeter that privacy relies on.
* [Postgres](/docs/principles/stack/postgres) — retention enforced at the storage layer.
* [Operations](/docs/operations) — the incident response for a privacy event.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **"Privacy is the lawyers' job."** By the time the lawyers are involved, the damage is done. Privacy is an engineering discipline.
* **Retention by default to forever.** Growing tables nobody cleans are ticking privacy incidents.
* **Development data scraped from production.** A dev environment with a sample of real user transcripts is a breach waiting to be noticed.
* **Analytics as a free pass.** "It is for analytics" is not a sufficient justification for collecting a piece of PII. The same bar applies.
* **PII in logs.** Trace and log data routinely outlives the systems that produced it. PII does not belong there.
* **Consent-by-omission.** Checking a "we may use your data to improve the model" box buried in a ToS is not consent.
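
For the "PII in logs" item, a last-resort safety net can be sketched as a logging filter. The patterns below are deliberately coarse and the names are hypothetical; a filter like this is a backstop for mistakes, not a substitute for keeping PII out of log calls in the first place.

```python
import logging
import re

# Coarse patterns for illustration; they will miss some forms of PII.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text: str) -> str:
    """Replace email addresses and IPv4 addresses with fixed placeholders."""
    return IPV4.sub("[ip]", EMAIL.sub("[email]", text))


class RedactPII(logging.Filter):
    """Redact the fully formatted message before any handler sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # the message is already formatted
        return True
```

Attaching `RedactPII()` to a handler with `addFilter` applies it to every record that handler emits.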

## Further reading [#further-reading]

* *GDPR* text and ICO guidance — the canonical European framework.
* *CCPA/CPRA* — the Californian counterpart.
* *Privacy by Design*, Ann Cavoukian — the foundational essay on baking privacy into architecture.
* *Data Protection Impact Assessments* (ICO) — the practical model we use for privacy reviews.


# Reliability (/docs/principles/quality/reliability)


# Reliability [#reliability]

## TL;DR [#tldr]

Reliability is not a feature we add after the system is built. It is a design property we pay for up front, measured in error budgets, defended by graceful-degradation patterns, and rehearsed through deliberate failure injection. Every significant service owns an SLO and lives inside the error budget it implies.

## Why this matters [#why-this-matters]

Users do not experience "uptime percentages" — they experience "the thing I needed did not work just now." Reliability is the discipline of holding the second experience rare enough that users learn to trust the platform. In a real-time product like Wordloop, unreliability compounds: a dropped audio packet becomes a missed segment, a missed segment becomes a broken Transcription, a broken Transcription becomes an unusable MeetingSynthesis. The cost of a small reliability failure is rarely proportional to its scope.

## Our principles [#our-principles]

### 1. SLOs, not uptime percentages [#1-slos-not-uptime-percentages]

Every significant service defines a Service Level Objective — a per-endpoint or per-user-journey target with a latency and a success-rate component, measured over a rolling window. "99.9% uptime" is not an SLO; "p95 `POST /turns` \< 300ms over 30 days, 99.5% success" is. SLOs are the measurement surface for everything else in this page.

### 2. Error budgets govern velocity [#2-error-budgets-govern-velocity]

The budget implied by the SLO — the allowed volume of "bad" events — is a spendable resource. Teams spending below budget can ship riskier changes and run experiments. Teams above budget pause feature work and pay down reliability debt. This inversion — reliability as a gate on feature velocity rather than a tax on top of it — is what makes SLOs operationally real.
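
The budget is plain arithmetic on the SLO target. The sketch below uses the 99.5% success target from the example SLO above; the traffic volume is an illustrative assumption.

```python
def error_budget(slo_success: float, total_events: int) -> float:
    """Number of bad events the SLO allows over the window."""
    return (1.0 - slo_success) * total_events


def budget_remaining(slo_success: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent (negative when overspent)."""
    budget = error_budget(slo_success, total_events)
    if budget == 0:
        raise ValueError("a 100% SLO has no error budget")
    return 1.0 - bad_events / budget


# 99.5% success over a 30-day window with 2,000,000 requests (assumed volume).
budget = error_budget(0.995, 2_000_000)               # about 10,000 bad requests
remaining = budget_remaining(0.995, 2_000_000, 2_500)  # about 0.75
```

With three quarters of the budget unspent, the team in this example has room for a riskier rollout. A negative value is the signal to pause feature work.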

### 3. Graceful degradation is a design, not a hope [#3-graceful-degradation-is-a-design-not-a-hope]

Every user-facing feature has a defined behaviour when its downstream fails. A Meeting view without synthesis data still renders: the synthesis panel shows a "not yet ready" state. A transcription pipeline without a model client enqueues segments and resumes when it can. Degradation is decided at design time and implemented alongside the happy path, never "we will figure out what to show later."

### 4. Timeouts, retries, and circuit breakers are defaults [#4-timeouts-retries-and-circuit-breakers-are-defaults]

Every outbound call has a timeout, every retry has a bounded policy with jitter, and every client has a circuit breaker against its most important downstreams ([Integration Patterns](/docs/principles/system-design/integration-patterns)). Defaults are set in a shared library so that a new service inherits them; opting out requires a written reason.
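
To make two of these defaults concrete, here is a minimal, single-threaded sketch of bounded retry delays with jitter and a consecutive-failure circuit breaker. The names and default values are assumptions for illustration, not the shared library's API. Timeouts are not shown, since they belong to whichever client makes the call.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base_s: float = 0.1, cap_s: float = 2.0,
                   rng: Optional[random.Random] = None) -> list[float]:
    """Full-jitter exponential backoff: delay i is drawn uniformly from
    [0, min(cap_s, base_s * 2**i)], and the list length bounds the retries."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap_s, base_s * 2 ** i)) for i in range(attempts)]


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses a call without trying it."""


class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures. Once
    `reset_after_s` has elapsed it allows a trial call, closing on success
    and re-opening on failure. Not thread-safe."""

    def __init__(self, failure_threshold: int = 5, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: fall through and allow a trial call.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None
        return result
```

Jitter matters because clients that fail together otherwise retry together. Spreading the delays keeps a recovering downstream from being hit by a synchronised wave.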

### 5. Isolate blast radius [#5-isolate-blast-radius]

A single tenant, a single user, or a single noisy consumer must not be able to degrade the experience for everyone else. We isolate by quota (per-tenant rate limits), by resource (dedicated queues for hot workloads), and by bulkhead (separate worker pools for separate work types). The design question is always: "if this goes bad, who else is affected?" — and the answer we aim for is "only the thing that went bad."
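
Isolation by quota can be sketched as one token bucket per tenant. The class names and parameters are hypothetical, and a production limiter would also need locking and eviction of idle buckets.

```python
import time
from typing import Callable


class TokenBucket:
    """Refills at `rate_per_s` tokens per second up to `burst` tokens."""

    def __init__(self, rate_per_s: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate_per_s = rate_per_s
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)
        self._last = clock()

    def allow(self) -> bool:
        """Spend one token if available; refill first based on elapsed time."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.burst, self._tokens + elapsed * self.rate_per_s)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class TenantLimiter:
    """One bucket per tenant, so a noisy tenant exhausts only its own quota."""

    def __init__(self, rate_per_s: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate_per_s = rate_per_s
        self._burst = burst
        self._clock = clock
        self._buckets: dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str) -> bool:
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self._rate_per_s, self._burst, self._clock)
            self._buckets[tenant_id] = bucket
        return bucket.allow()
```

Because each tenant draws from its own bucket, the answer to "who else is affected?" for a runaway client is nobody.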

### 6. Rehearse failure [#6-rehearse-failure]

Chaos engineering is a practice, not an event. We inject failures — killed pods, degraded networks, slow databases — routinely in staging and, carefully, in production. The goal is not to "test if chaos works"; it is to discover the reliability assumptions we are making without knowing it. Every chaos experiment that finds something surprising is worth a year of CI.

### 7. Alerts fire on user impact, not on mechanism [#7-alerts-fire-on-user-impact-not-on-mechanism]

We alert when users are affected — SLO burn rate, error-rate spikes on user journeys — not when a server has 80% CPU. Pages that fire on mechanism without user impact teach on-call to ignore pages, which is how a real incident gets missed. See [On-Call](/docs/operations/on-call).
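
Burn rate is the observed error rate divided by the error rate the SLO allows. The sketch below pairs it with the two-window check described in *The Site Reliability Workbook*. The 14.4 threshold is that book's example for a 30-day window (a one-hour burn at that rate spends 2% of the budget), used here as an assumption and not as our configured value.

```python
def burn_rate(bad: int, total: int, slo_success: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring.
    1.0 means the budget would be spent exactly at the end of the window."""
    if total == 0:
        return 0.0
    return (bad / total) / (1.0 - slo_success)


def should_page(long_window: tuple[int, int], short_window: tuple[int, int],
                slo_success: float, threshold: float = 14.4) -> bool:
    """Page only when both windows burn above the threshold: the long window
    shows the problem is significant, the short one that it is still happening.
    Each window is a (bad, total) pair of event counts."""
    return (
        burn_rate(*long_window, slo_success) >= threshold
        and burn_rate(*short_window, slo_success) >= threshold
    )


# 2% of requests failing against a 99.5% SLO burns budget about 4x too fast.
rate = burn_rate(bad=200, total=10_000, slo_success=0.995)
```

A burn rate of 4 is worth a ticket but not a page under this threshold, which is the distinction between user impact that needs a human now and impact that can wait for working hours.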

### 8. Every incident teaches a specific lesson [#8-every-incident-teaches-a-specific-lesson]

Post-incident, we write a blameless postmortem that names the specific reliability assumption the incident invalidated and proposes the specific change that would have caught it. We do not write "be more careful" as an action item. We do not write "add more monitoring" without specifying the signal. The goal is one concrete, closable ticket per incident, enforceable and measurable.

## How we apply this [#how-we-apply-this]

* [Observability](/docs/principles/quality/observability) — the measurement layer that makes SLOs possible.
* [Performance](/docs/principles/quality/performance) — the tail-latency discipline that sits inside reliability.
* [Integration Patterns](/docs/principles/system-design/integration-patterns) — the concrete patterns (timeouts, circuit breakers) we apply.
* [Operations](/docs/operations) — runbooks, on-call rotation, incident response.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **"99.999% uptime" as a target.** Five-nines for a non-core service is a reckless budget. Set an SLO the team can defend.
* **Retries without policies.** Retry-forever is a self-inflicted DDoS.
* **Mechanism alerts.** Paging on CPU, memory, or disk without tying it to a user-impact signal. Noise.
* **"It has not failed yet."** The absence of an observed failure is not evidence that no failure mode exists. Rehearse.
* **Postmortems that blame humans.** A system that depends on everyone being perfect will fail. The action item is the system fix, not the person lecture.
* **SLOs nobody tracks.** An SLO without a dashboard and a burn-rate alert is theatre.

## Further reading [#further-reading]

* *Site Reliability Engineering*, Beyer et al. (the Google SRE book) — the canonical text for SLOs, error budgets, and the operational stance.
* *The Site Reliability Workbook* — the practical companion to the SRE book; more actionable.
* *Release It!*, Michael Nygard — the stability-patterns bible.
* *Chaos Engineering*, Rosenthal & Jones — the current state of rehearsed-failure practice.


# Security (/docs/principles/quality/security)


# Security [#security]

## TL;DR [#tldr]

Security is every engineer's job, every day. We treat every service as untrusted, every dependency as a supply-chain risk, every input as hostile, and every secret as already-compromised unless we can prove otherwise. The goal is not zero risk — it is a system that stays standing when any single control fails.

## Why this matters [#why-this-matters]

Wordloop processes meeting audio. That data is confidential by default and sometimes regulated by statute. A security incident is not an inconvenience for us; it is a breach of the trust users place in us to handle the most private thing they say in a day. Security is the baseline that every other quality concern rests on. A system that is reliable but exploitable is not reliable.

## Our principles [#our-principles]

### 1. Zero trust between services [#1-zero-trust-between-services]

Services authenticate each other on every request. No "internal" network is trusted implicitly; every call carries an identity, every identity is authorised per operation. The breach-resistance argument is simple — if an attacker pivots into one service, they do not inherit the blast radius of the entire system.

### 2. Threat model the change, not just the product [#2-threat-model-the-change-not-just-the-product]

Every significant change asks the security question before the design is signed off: who could misuse this, and how? A new endpoint, a new data field, a new integration — each gets a five-minute threat conversation. This is cheap upfront and catches most of the issues that would otherwise be found in a pen test or, worse, in production.

### 3. Secrets are managed, rotated, and audited [#3-secrets-are-managed-rotated-and-audited]

No secret lives in source. Secrets live in a secret manager, are fetched at runtime, are rotated on a schedule, and every access is audited. A leaked secret's damage window is measured in hours, not years, because we assumed it would leak and planned for it.

### 4. Input is hostile; validate at the boundary [#4-input-is-hostile-validate-at-the-boundary]

Every piece of input at a trust boundary is validated: request bodies, webhook payloads, message queue events, model outputs. Inside the trust boundary we trust our own types and do not repeat the checks ([Code Craft](/docs/principles/foundations/code-craft)). The discipline is that the boundary is explicit and every crossing is scrutinised.
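
The pattern is to parse untrusted input into a typed value once, at the boundary, and pass only that type inward. The sketch below is illustrative: `CreateTurn`, its fields, and the length limit are hypothetical, and a real service would likely use its framework's validation layer.

```python
from dataclasses import dataclass
from typing import Any


class ValidationError(ValueError):
    """Raised when input fails validation at the trust boundary."""


@dataclass(frozen=True)
class CreateTurn:
    """Hypothetical request type; code inside the boundary accepts only this."""
    meeting_id: str
    speaker: str
    text: str


FIELDS = ("meeting_id", "speaker", "text")
MAX_TEXT_LEN = 10_000  # assumed limit for the sketch


def parse_create_turn(body: Any) -> CreateTurn:
    """Validate a decoded JSON body and return a trusted CreateTurn."""
    if not isinstance(body, dict):
        raise ValidationError("body must be a JSON object")
    unknown = set(body) - set(FIELDS)
    if unknown:
        raise ValidationError(f"unknown fields: {sorted(map(str, unknown))}")
    values = {}
    for name in FIELDS:
        value = body.get(name)
        if not isinstance(value, str) or not value.strip():
            raise ValidationError(f"{name} must be a non-empty string")
        values[name] = value
    if len(values["text"]) > MAX_TEXT_LEN:
        raise ValidationError("text is too long")
    return CreateTurn(**values)
```

Functions past this point take a `CreateTurn`, not a `dict`, so they have no reason to repeat the checks.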

### 5. Supply chain is part of our attack surface [#5-supply-chain-is-part-of-our-attack-surface]

Every third-party dependency is a potential exploit vector. We pin versions, review new dependencies before adoption, run SBOM generation and vulnerability scans on every build, and follow SLSA supply-chain integrity practices. A dependency added without review is a back door added without review.

### 6. Least privilege by default [#6-least-privilege-by-default]

Every service, every database role, every cloud identity starts with the minimum permissions it needs and is extended only on evidence. "Give it admin and fix it later" is a decision whose "later" never arrives. IAM policies, Postgres roles, and credential scopes are reviewed in the same way code is reviewed.

### 7. Auth is boring technology [#7-auth-is-boring-technology]

We do not invent auth. Clerk handles user authentication; service-to-service auth uses short-lived tokens from a standard identity provider; session storage follows the OWASP guidance for the context. Exotic auth is how a team learns about auth vulnerabilities the hard way.

### 8. Detect and respond, not just prevent [#8-detect-and-respond-not-just-prevent]

Assume prevention will sometimes fail. We log security-relevant events, alert on suspicious patterns, and run incident-response tabletops so the team knows what to do when something happens. Detection that arrives after the incident is cleaned up is not detection.

## How we apply this [#how-we-apply-this]

* [Privacy](/docs/principles/quality/privacy) — the handling of regulated data sits inside the security perimeter.
* [Reliability](/docs/principles/quality/reliability) — stability and security share a lot of failure-mode vocabulary.
* [API Design](/docs/principles/system-design/api-design) — signed webhooks, idempotency keys, and structured errors that do not leak internals.
* [Operations](/docs/operations) — runbooks for security-relevant incidents.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Internal network = trusted.** This is the assumption every modern breach exploits.
* **Secrets in environment variables checked into Git.** Use the secret manager. Always.
* **"It is an internal tool, we can skip auth."** Internal tools are an attacker's favourite foothold.
* **Dependencies pulled in on intuition.** A package with 12 stars, no maintainer, and a vague promise is a supply-chain risk.
* **Exotic auth.** Custom JWT handling, custom session cookies, custom MFA flows. Use the standard, battle-tested thing.
* **"The WAF will catch it."** A web application firewall is a last layer. Primary defence is correct code.

## Further reading [#further-reading]

* *The Tangled Web*, Michal Zalewski — the canonical tour of web-security oddness.
* *The Web Application Hacker's Handbook*, Stuttard & Pinto — read once to know what you are defending against.
* *OWASP Top 10* — the catalogue of vulnerabilities every web engineer must know.
* *SLSA Framework* ([slsa.dev](https://slsa.dev)) — the supply-chain integrity ladder.
* *Zero Trust Architecture*, NIST SP 800-207 — the canonical definition.


# Frontend (/docs/principles/stack/frontend)


# Frontend [#frontend]

## TL;DR [#tldr]

The frontend is Next.js 16 with React 19 Server Components, SWR for client-side data, and Tailwind for styling. We render on the server by default, hydrate the minimum needed for interactivity, and keep data-fetching at the leaves. The design language is consistent, calm, and accessible first.

## Why this matters [#why-this-matters]

Frontend engineering in 2026 is no longer a "client-side" discipline. Most of our pixels are rendered on the server before they reach the browser, and the browser's job is to stay responsive to the user's input rather than to fetch data on their behalf. Getting this split right — what runs on the server, what runs on the client, what streams in between — is the single biggest determinant of how fast and how reliable our app feels. It is also where most frontend bugs live.

## Our principles [#our-principles]

### 1. Server components are the default [#1-server-components-are-the-default]

Every component starts as a Server Component. We add `"use client"` only when we need state, events, or browser-only APIs. The economic argument is simple: every client component costs download, parse, hydrate, and memory on every user's device. Server components cost none of that. The ratio of server-to-client components in `wordloop-app` is tracked and kept high.

### 2. Data fetches at the leaves, not the root [#2-data-fetches-at-the-leaves-not-the-root]

Data fetching happens in the component that actually renders the data, not in a page-level fetcher that passes everything down through props. This lets Suspense boundaries stream exactly as deep as they need to, and it keeps prop-drilling in check. The exception is when two leaves need the same data — then we fetch once in a common ancestor and share via React's built-in request deduplication.

### 3. SWR for client-interactive state [#3-swr-for-client-interactive-state]

For state that must respond to user interaction in real time — live Meeting views, session state, optimistic updates — we use SWR. One cache, one invalidation story, one mental model. We do not mix query libraries in the same app.

### 4. Styling is Tailwind utilities, composed [#4-styling-is-tailwind-utilities-composed]

We style with Tailwind utility classes. When a composition of utilities gets long or is repeated, we extract a component — not a custom CSS class. Component extraction keeps the `className` strings honest; new CSS classes are where design systems go to die of untracked one-offs.

### 5. The design system is a library, not a guideline [#5-the-design-system-is-a-library-not-a-guideline]

Buttons, inputs, modals, tooltips — every primitive is a typed, reviewed component in the shared component library. Ad-hoc styling of a button in a feature folder is a smell; the fix is to add the variant to the library, not to reimplement it.

### 6. Accessibility is a baseline, not a feature [#6-accessibility-is-a-baseline-not-a-feature]

Every interactive component supports keyboard navigation, is screen-reader labelled, and meets WCAG 2.2 AA contrast at minimum. Accessibility failures block merges the same way type errors do — see [Accessibility](/docs/principles/quality/accessibility). The golden path for every new UI begins with "can I get to it, use it, and understand it with just the keyboard and a screen reader?"

### 7. Client state is recoverable [#7-client-state-is-recoverable]

We do not store state in React that cannot be rebuilt from the server or the URL. Refreshing the page is the end-to-end test of this: if the user loses context after a refresh, we are holding state we should not. The URL is a first-class state container; so are Server Components.

### 8. Performance budgets are enforced in CI [#8-performance-budgets-are-enforced-in-ci]

Largest Contentful Paint, Interaction to Next Paint, and JS bundle size are all tracked in CI with budgets. A PR that regresses a budget requires an explicit waiver. Performance is never negotiated after the fact; it is designed in ([Performance](/docs/principles/quality/performance)).

## How we apply this [#how-we-apply-this]

* [App Service Handbook](/docs/learn/services/app) — the architectural walkthrough for `wordloop-app`.
* [Accessibility](/docs/principles/quality/accessibility) — the baseline requirements every UI must meet.
* [Performance](/docs/principles/quality/performance) — the latency budgets we hold ourselves to.
* [Real-Time](/docs/principles/system-design/real-time) — the WebSocket layer that the live UI consumes.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **`useEffect` for data fetching.** `useEffect` is an escape hatch for non-React systems; it is not a data-fetching primitive. Use Server Components or SWR.
* **Context for everything.** React Context is a tool for genuinely app-wide concerns (theme, auth, locale). Using it to avoid prop-drilling on three levels is overreach.
* **CSS Modules alongside Tailwind.** One styling system. Not two.
* **Ad-hoc design primitives.** Every new button variant is a tax on the design system. If it needs a variant, add it to the library.
* **State that cannot survive refresh.** Modals that disappear on refresh lose user context; counters that reset on refresh are not counters.
* **"Fix it in a later PR" accessibility.** The later PR will not happen. Ship accessible or ship later.

## Further reading [#further-reading]

* *React documentation* ([react.dev](https://react.dev)) — the canonical source for the Server Component mental model.
* *Patterns.dev*, Lydia Hallie & Addy Osmani — a clean survey of modern frontend patterns.
* *Inclusive Components*, Heydon Pickering — the pattern language of accessible component design.
* *Refactoring UI*, Schoger & Wathan — the design vocabulary that informs how we compose Tailwind.


# Go Services (/docs/principles/stack/go-services)


# Go Services [#go-services]

## TL;DR [#tldr]

Our Go code is boring on purpose. It leans into the standard library, uses interfaces only where they earn their keep, treats errors as values with context, and structures services with the gateway pattern — our Go-idiomatic expression of hexagonal architecture.

## Why this matters [#why-this-matters]

Go rewards the engineer who resists cleverness. A Go codebase that reads like the standard library is one that new engineers and AI agents can contribute to the day they arrive. A Go codebase that imports a dozen frameworks and wraps every primitive in a "clean" abstraction is one where the cost of understanding must be paid before any productive work begins. `wordloop-core` is the largest single piece of Go we maintain, and keeping it legible is the highest-leverage investment we make.

## Our principles [#our-principles]

### 1. Standard library first [#1-standard-library-first]

`net/http`, `context`, `database/sql` — the standard library is the default, and we reach for a third-party package only when the standard library demonstrably cannot do the job. Frameworks that "wrap" the standard library to make it "easier" usually make it harder to reason about and harder for a new reader to follow.

### 2. Gateway pattern for services [#2-gateway-pattern-for-services]

Every service in `wordloop-core` is structured as a gateway: a thin HTTP handler at the edge that extracts and validates inputs, an application service that orchestrates, domain types that hold rules, and repository interfaces (our "ports") with Postgres-backed implementations. This is hexagonal in Go idioms — flat package layout, exported interfaces, unexported concrete types. See [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) for the underlying model.

### 3. Errors are values, and they carry context [#3-errors-are-values-and-they-carry-context]

We wrap errors with `fmt.Errorf("doing X: %w", err)` when adding context, and we inspect them with `errors.Is` and `errors.As` at the boundary. We do not `panic` to signal ordinary failures in service code; `panic` is reserved for truly unrecoverable conditions (a nil interface where the type system should have prevented it) and is recovered only at the HTTP boundary. Sentinel errors are defined where the caller must branch on them; structured error types are defined where the caller needs detail.

### 4. Context is threaded everywhere [#4-context-is-threaded-everywhere]

Every function that does I/O takes a `context.Context` as its first parameter. Cancellation and deadlines are respected. A goroutine that outlives its parent context without explicit opt-in is a bug. `context.Background()` appears only at program entry points and in tests.

### 5. Concurrency is simple or explicit [#5-concurrency-is-simple-or-explicit]

`go` statements that fire-and-forget are banned; every goroutine is tracked by a `sync.WaitGroup`, an `errgroup.Group`, or a channel that signals completion. Leaked goroutines are how a service slowly eats its memory. Shared state is accessed through channels by default, through mutexes when a channel would be awkward, and never without synchronisation.

### 6. Interfaces are defined by consumers [#6-interfaces-are-defined-by-consumers]

We define interfaces in the package that consumes them, and constructors return concrete types: the "accept interfaces, return structs" rule. A package that exports an interface purely to describe its own implementation is publishing its internals as a contract. Small interfaces (one to three methods) compose well; wide interfaces are a smell.

### 7. Dependency injection is manual and explicit [#7-dependency-injection-is-manual-and-explicit]

We do not use runtime dependency-injection frameworks. Dependencies are passed into constructors, and the composition happens in a `cmd/` entry point. This is the "composition root" pattern, and in Go it is both trivial and powerful — every dependency is visible at build time, and cycles are impossible by construction.

### 8. Tests use the real thing where possible [#8-tests-use-the-real-thing-where-possible]

Service tests spin up a real Postgres container via Testcontainers and exercise the handler through HTTP. We mock only the expensive external edges (model clients, third-party APIs). See [Testing](/docs/principles/foundations/testing) for the "emulate, don't mock" discipline.

## How we apply this [#how-we-apply-this]

* [Core Service Handbook](/docs/learn/services/core) — the architectural walkthrough for `wordloop-core`.
* [Add an API Endpoint](/docs/guides/add-api-endpoint) — the canonical change to a Go handler.
* [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) — the pattern the gateway expresses.
* [Real-Time](/docs/principles/system-design/real-time) — the streaming surface layered on top of HTTP.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Framework-flavoured Go.** Heavy router libraries, "web framework" abstractions, ORMs that rewrite your queries. The standard library is enough.
* **`interface{}` or `any` as a type.** Except at the boundary of reflection-based code, `any` is a signal you lost the type war. Name the type.
* **Package-level mutable state.** Config, loggers, metrics registries stored in package variables. Inject them, always.
* **Goroutines without supervision.** If you launch it, you own its lifetime. Track it.
* **`init()` functions that do work.** `init` is for registering, not for running. Work belongs in `main`.
* **Pointer-to-struct when a value would do.** Pointers imply "this may be nil or may be mutated." If neither is true, pass the value.

## Further reading [#further-reading]

* *The Go Programming Language*, Donovan & Kernighan — the canonical text.
* *100 Go Mistakes and How to Avoid Them*, Teiva Harsanyi — a catalogue of the gotchas that bite real codebases.
* *Dave Cheney's blog* ([dave.cheney.net](https://dave.cheney.net)) — the single best source on Go idioms, errors, and testing.
* *The Go Memory Model* ([golang.org/ref/mem](https://golang.org/ref/mem)) — read it at least once if you are writing concurrent code.


# Stack (/docs/principles/stack)


# Stack [#stack]

Every language has a set of idioms that separate code that works from code that belongs. This section is where we commit to those idioms — explicitly, so that new contributors and agents can meet the bar without having to reverse-engineer it from the existing codebase.

<Cards>
  <Card title="Go Services" href="/docs/principles/stack/go-services" description="Idiomatic Go, gateway patterns, error handling, concurrency, and the shape of wordloop-core." />

  <Card title="Frontend" href="/docs/principles/stack/frontend" description="React 19, Server Components, SWR, Tailwind, and the design language of wordloop-app." />

  <Card title="ML Systems" href="/docs/principles/stack/ml-systems" description="Python, FastAPI, hexagonal ML architecture, model serving, and the disciplines that keep an AI runtime reliable." />

  <Card title="Postgres" href="/docs/principles/stack/postgres" description="Schema design, JSONB, migrations, indexing, and pgvector as our production vector store." />
</Cards>

If you are making a choice these stack pages do not cover, it is probably a decision that belongs in an [ADR](/docs/decisions).


# ML Systems (/docs/principles/stack/ml-systems)


# ML Systems [#ml-systems]

## TL;DR [#tldr]

`wordloop-ml` is a Python FastAPI service structured hexagonally, with model clients, storage, and transport as adapters around a domain of transcription, recap, and embedding. We treat model calls as external dependencies with the same rigour as any other integration — timeouts, retries, budgets, idempotency — and we evaluate model outputs as a first-class part of the test suite.

## Why this matters [#why-this-matters]

Machine-learning code in most organisations is a separate zoo from the rest of the backend — different language norms, different testing discipline, different release discipline. We explicitly reject this separation. An ML service is a service; it must meet the same bars for reliability, observability, and maintainability as any other. What makes ML different is *what* we test (model behaviour, not just code behaviour), not *whether* we test.

## Our principles [#our-principles]

### 1. FastAPI for HTTP, Uvicorn for serving, uv for everything else [#1-fastapi-for-http-uvicorn-for-serving-uv-for-everything-else]

FastAPI for routing and validation, Uvicorn for serving, `uv` for dependency management. The Python ecosystem has a hundred alternatives for each of these; we pick one combination and apply it everywhere.

### 2. Hexagonal from the outset [#2-hexagonal-from-the-outset]

`wordloop-ml` has explicit `domain/`, `ports/`, `adapters/`, and `application/` packages, enforced by `import-linter` rules in CI. Model clients, storage, and the FastAPI router are all adapters. The domain — transcripts, recaps, embeddings — has no model-library imports. See [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture).

### 3. Model calls are treated as external integrations [#3-model-calls-are-treated-as-external-integrations]

Every call to a model is wrapped in a `ModelClient` port, implemented by an adapter that handles timeouts, retries with jitter, circuit breaking, and rate-limit respect. The domain never knows which provider is behind the port. Swapping providers is an adapter change — nothing more.

### 4. Evals are part of the test suite [#4-evals-are-part-of-the-test-suite]

We maintain an eval set for every significant model-driven behaviour — recap quality, transcription accuracy, embedding consistency — and run it in CI on any change that could affect output. Evals produce numeric scores; thresholds are committed; regressions block merge the same way a failing unit test does. "The model got a little worse" is not an acceptable landing state.

### 5. Prompts are code, not configuration [#5-prompts-are-code-not-configuration]

Prompts live in version control, are reviewed, and are tested. They are *not* in a runtime config that someone can edit by accident. Prompt changes go through the same PR review as code changes, and they are covered by evals.

### 6. Observability spans both sides of the model call [#6-observability-spans-both-sides-of-the-model-call]

Every model call emits a trace span with input hash, prompt version, model ID, latency, token counts, and cost. An expensive prompt is visible before it is invoiced; a slow prompt is visible before it blocks a user. The model is not a black box inside our system — it is an instrumented dependency.

### 7. Caching and determinism are explicit [#7-caching-and-determinism-are-explicit]

When a model call can be cached — same input, same prompt version, same model — we cache it. Determinism parameters (temperature, seed) are set explicitly per use case; "whatever the default is" is not a choice. Caching is a first-order cost-engineering lever ([Cost Engineering](/docs/principles/delivery/cost-engineering)).

### 8. Stateful containers, not stateless [#8-stateful-containers-not-stateless]

Unlike our Go services, the ML service runs in stateful containers — models are loaded into memory on startup and kept warm for the life of the container. This is an intentional trade-off documented in an [ADR](/docs/decisions); cold-starting a large model per request is not viable at our scale.

## How we apply this [#how-we-apply-this]

* [ML Service Handbook](/docs/learn/services/ml) — the architectural walkthrough for `wordloop-ml`.
* [AI Engineering](/docs/principles/ai-native/ai-engineering) — the broader disciplines for building AI features.
* [Observability](/docs/principles/quality/observability) — how we trace and measure model calls.
* [Hexagonal Architecture](/docs/principles/system-design/hexagonal-architecture) — the structural pattern `wordloop-ml` follows most aggressively.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Model library imports in the domain.** If the domain imports `openai` or `torch`, the domain is no longer the domain.
* **Prompts in a runtime config.** Untracked, unreviewed, unversioned prompts will drift and break evals silently.
* **"It is an ML service, testing is different."** It is not. The tests just include evals.
* **Uncached expensive calls.** Every call with a stable input that we pay for twice is a bug.
* **Model outputs trusted blindly.** We validate shape, length, and content of model outputs at the adapter boundary. An unchecked model output flowing into the domain is an injection vector waiting to happen.
* **Synchronous long model calls on the request path.** Anything that takes more than a few hundred milliseconds queues to a worker and returns a job handle.

## Further reading [#further-reading]

* *Designing Machine Learning Systems*, Chip Huyen — the systems view of production ML.
* *Evaluating and Reinforcing LLM Behaviors*, Shreya Shankar et al. — the canonical treatment of eval design.
* *FastAPI documentation* — read the dependency-injection and pydantic chapters closely.
* *The Twelve-Factor App* — the ML service still respects all twelve, especially config, logs, and dependencies.


# Postgres (/docs/principles/stack/postgres)


# Postgres [#postgres]

## TL;DR [#tldr]

Postgres is the canonical data store for every service that needs persistence. We design schemas explicitly, migrate online, index deliberately, and use `pgvector` as our vector store. When the question is "which database?", the answer is Postgres unless we have a specific, written reason it cannot be.

## Why this matters [#why-this-matters]

Every additional datastore in a system is a multiplier on operational complexity: another backup story, another failure mode, another skill profile to hire for, another surface to monitor. Postgres is a remarkable outlier — it does relational, JSONB document storage, full-text search, and vector similarity well enough that most workloads never need another engine. Committing to it as a default keeps the operational surface small and the engineers productive.

## Our principles [#our-principles]

### 1. Schema design is a design document [#1-schema-design-is-a-design-document]

Every new table begins with a schema design: what does it represent, what identifies it, what are the invariants, what queries does it need to support, what retention does it live under. This is not a formality — schema shape is the contract that outlives any service that reads or writes the table ([Data Engineering](/docs/principles/system-design/data-engineering)).

### 2. Prefer columns to JSONB for stable shape [#2-prefer-columns-to-jsonb-for-stable-shape]

JSONB is powerful but it is not a replacement for column design. When a field is present on every row, queried often, and stable in meaning, it belongs in a column. JSONB is the right call when the shape varies per row, is rarely queried directly, or is a bag of external metadata. The default is columns.

### 3. Migrations are additive, reversible, and online [#3-migrations-are-additive-reversible-and-online]

Every migration is additive — new columns are nullable or carry sensible defaults, new tables start empty. We never block on a migration that rewrites a large table in a single transaction; long DDL runs online, with back-filling separated into background jobs. Rollback is pre-written; a migration without a rollback is not a finished migration ([Migrate the Schema](/docs/guides/migrate-schema)).

### 4. Indexes are evidence-based [#4-indexes-are-evidence-based]

Every index is justified by a query pattern backed by real production data — `pg_stat_user_indexes` and `pg_stat_statements` tell us which queries are hot and which indexes are paying their cost. Unused indexes cost write throughput and disk; we remove them. Speculative indexes "in case we need them later" are the opposite of the principle.

### 5. `pgvector` is our vector store [#5-pgvector-is-our-vector-store]

Semantic search, embedding similarity, RAG retrieval — all of this runs on `pgvector` in the same Postgres cluster as relational data. This is an explicit decision documented as an [ADR](/docs/decisions) in favour of operational simplicity over marginal performance. If we ever need a dedicated vector DB, the data and the requirement will make the case.

### 6. Connection management is explicit [#6-connection-management-is-explicit]

Every service manages its connection pool with deliberate sizing — max connections, idle timeouts, statement timeouts, per-service limits. "Just use the defaults" is how Postgres gets hammered into `too many connections` errors under load. Postgres is a shared resource; treat it like one.

### 7. Query patterns are reviewed [#7-query-patterns-are-reviewed]

Every new query is reviewed for plan shape, not just correctness. `EXPLAIN ANALYZE` on representative data is part of the PR for any non-trivial query. N+1 queries, full-table scans, and unbounded `IN` lists are caught in review, not in production.

### 8. Backups, retention, and disaster recovery are not afterthoughts [#8-backups-retention-and-disaster-recovery-are-not-afterthoughts]

Automated backups run with RPO and RTO targets that the business has signed off on. We test restores — a backup we have never restored is not a backup. Retention policies are set per table at creation time and aligned with the privacy policy ([Privacy](/docs/principles/quality/privacy)).

## How we apply this [#how-we-apply-this]

* [Database Reference](/docs/reference/database) — the schema, regenerated from migrations.
* [Migrate the Schema](/docs/guides/migrate-schema) — the canonical workflow for DDL changes.
* [Data Engineering](/docs/principles/system-design/data-engineering) — the broader treatment of data contracts.
* [Privacy](/docs/principles/quality/privacy) — the rules that shape retention and residency.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **JSONB-everything.** Not a schema; a confession of avoided design.
* **Indexes "just in case."** Every index is a write tax; justify it from a query or remove it.
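* **Migrations that lock a hot table.** `ALTER TABLE ... ADD COLUMN ... NOT NULL DEFAULT` with a volatile default rewrites every row of a 10M-row table under an exclusive lock (constant defaults have been metadata-only since Postgres 11, but the habit is still risky). Add the column as `NULL` first, backfill, then tighten.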
* **Using Postgres as a message broker.** It can work; it is still not what we should be doing when we also run Pub/Sub.
* **Raw string interpolation into queries.** Parameterised queries, always. This is a security rule ([Security](/docs/principles/quality/security)) and a clarity rule.
* **A second database "just because."** Adding Redis or DynamoDB without a specific, documented need Postgres cannot meet. Most of the time, Postgres can.

## Further reading [#further-reading]

* *PostgreSQL: Up and Running*, Obe & Hsu — a practical, current reference.
* *The Art of PostgreSQL*, Dimitri Fontaine — advanced patterns with a teaching bent.
* *Designing Data-Intensive Applications*, Martin Kleppmann — the systems-level argument for relational-as-default.
* *pgvector documentation* ([github.com/pgvector/pgvector](https://github.com/pgvector/pgvector)) — the canonical source for vector index strategies.


# API Design (/docs/principles/system-design/api-design)


# API Design [#api-design]

## TL;DR [#tldr]

Every Wordloop API starts as a contract — OpenAPI for HTTP, AsyncAPI for events — and the code is generated from that contract. APIs are versioned deliberately, evolved additively, and shaped so that both human developers and AI agents can consume them without surprise.

## Why this matters [#why-this-matters]

An API is the most durable commitment a service makes. Once it is in production and a client depends on it, changing it is expensive; breaking it is catastrophic. The discipline of API design is not about getting the first version "right" — it is about making the next ten versions safe to ship. In 2026, the stakes are higher still: agents read our APIs programmatically, generate clients against them, and compose them into workflows we did not design. A poorly shaped API is no longer just a developer-experience problem; it is an agent-productivity problem.

## Our principles [#our-principles]

### 1. Contract-first, code-generated [#1-contract-first-code-generated]

Specs live in `/specs` and are the source of truth. Server handlers, typed clients, and reference docs are generated from them. We write the spec before we write the handler, and we let the generator produce both sides of the wire. Hand-rolled clients drift; generated clients cannot.

### 2. Explicit versioning, additive evolution [#2-explicit-versioning-additive-evolution]

Breaking changes require a new major version — `/v2`, or equivalent media-type negotiation — and a documented deprecation window for the prior version. Within a major version, we evolve *additively*: new optional fields, new endpoints, new response codes. Existing clients must never break because we extended the schema.

### 3. Resources, not RPCs [#3-resources-not-rpcs]

HTTP endpoints model resources (`POST /loops`, `GET /loops/{id}/turns`), not verbs (`POST /createLoop`). The resource shape forces us to think about identity, lifecycle, and composition up front. When a true verb is unavoidable (`POST /loops/{id}/transcribe`), we name it carefully and document why a resource shape does not fit.

### 4. Idempotency by design [#4-idempotency-by-design]

Every write endpoint accepts an `Idempotency-Key` header. Clients that retry on failure, which includes every agent we run, must be able to do so safely. The server is responsible for storing the key long enough to detect replays; safety must not depend on the client being careful.
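The replay behaviour can be sketched as a small `net/http` middleware. Everything below is an assumption made for illustration: the in-memory map (a real store would be shared across instances and expire keys), the choice to pass requests without a key straight through, and the demo handler that "creates" a resource by incrementing a counter. Response headers are not replayed in this sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// stored is the response replayed for a repeated key.
type stored struct {
	status int
	body   []byte
}

// capture records what the wrapped handler writes while passing it through.
type capture struct {
	http.ResponseWriter
	status int
	body   []byte
}

func (c *capture) WriteHeader(code int) {
	c.status = code
	c.ResponseWriter.WriteHeader(code)
}

func (c *capture) Write(b []byte) (int, error) {
	c.body = append(c.body, b...)
	return c.ResponseWriter.Write(b)
}

// idempotent replays the first response for any repeated Idempotency-Key.
func idempotent(next http.Handler) http.Handler {
	var mu sync.Mutex
	seen := map[string]stored{}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		key := r.Header.Get("Idempotency-Key")
		if key == "" {
			next.ServeHTTP(w, r) // no key, no replay protection
			return
		}
		mu.Lock()
		defer mu.Unlock() // serialises keyed writes: simple, not fast
		if s, ok := seen[key]; ok {
			w.WriteHeader(s.status)
			w.Write(s.body)
			return
		}
		c := &capture{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(c, r)
		seen[key] = stored{status: c.status, body: c.body}
	})
}

// newDemo wraps a handler that creates one resource per call and exposes the counter.
func newDemo() (http.Handler, *int) {
	created := 0
	h := idempotent(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		created++
		w.WriteHeader(http.StatusCreated)
		fmt.Fprintf(w, "loop-%d", created)
	}))
	return h, &created
}

// post sends one in-process POST and returns the status and body.
func post(h http.Handler, key string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/loops", nil)
	if key != "" {
		req.Header.Set("Idempotency-Key", key)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	h, created := newDemo()
	post(h, "k1")
	_, body := post(h, "k1") // a retry with the same key
	fmt.Println(*created, body)
}
```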

### 5. Pagination and filtering are uniform [#5-pagination-and-filtering-are-uniform]

Every collection endpoint paginates with the same cursor shape, filters with the same query-string grammar, and returns the same `next`/`prev` link structure. Reading one collection teaches you every collection. Inconsistent pagination between endpoints is a design smell that never scales.
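A uniform cursor usually means an opaque token that the client echoes back unchanged. The sketch below assumes a keyset cursor over ascending positive integer IDs, encoded as base64url JSON; the `cursor` fields, the `Page` envelope, and `paginate` are hypothetical and not the shape defined in the specs. Only the forward `next` direction is shown.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// cursor is the opaque position. Its fields are an assumption for this sketch.
type cursor struct {
	AfterID int `json:"after_id"`
}

func encodeCursor(c cursor) string {
	b, _ := json.Marshal(c) // cannot fail for this struct
	return base64.RawURLEncoding.EncodeToString(b)
}

func decodeCursor(s string) (cursor, error) {
	var c cursor
	if s == "" {
		return c, nil // an empty cursor means "start from the beginning"
	}
	b, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return c, fmt.Errorf("decoding cursor: %w", err)
	}
	if err := json.Unmarshal(b, &c); err != nil {
		return c, fmt.Errorf("parsing cursor: %w", err)
	}
	return c, nil
}

// Page is the envelope every collection would share.
type Page[T any] struct {
	Items []T    `json:"items"`
	Next  string `json:"next,omitempty"`
}

// paginate returns up to limit ids after the cursor position. ids must be
// sorted ascending; a real handler would push this filter into SQL.
func paginate(ids []int, cur string, limit int) (Page[int], error) {
	if limit < 1 {
		limit = 1
	}
	c, err := decodeCursor(cur)
	if err != nil {
		return Page[int]{}, err
	}
	var page Page[int]
	for _, id := range ids {
		if id <= c.AfterID {
			continue
		}
		if len(page.Items) == limit {
			// At least one more item exists, so hand out a cursor for it.
			page.Next = encodeCursor(cursor{AfterID: page.Items[limit-1]})
			break
		}
		page.Items = append(page.Items, id)
	}
	return page, nil
}

func main() {
	ids := []int{1, 2, 3, 4, 5}
	next := ""
	for {
		p, err := paginate(ids, next, 2)
		if err != nil {
			panic(err)
		}
		fmt.Println(p.Items)
		if p.Next == "" {
			break
		}
		next = p.Next
	}
}
```

Because the generic `Page[T]` envelope is shared, a client that has learned to follow `next` on one collection can follow it on any other.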

### 6. Errors are structured and machine-readable [#6-errors-are-structured-and-machine-readable]

Every error response carries a stable code, a human message, and a `details` object. Clients — and especially agents — branch on the code, not on the prose. Error codes are catalogued in [Reference / Errors](/docs/reference/errors) and never renumbered.

### 7. AI-agent readiness is a first-class concern [#7-ai-agent-readiness-is-a-first-class-concern]

OpenAPI specs include rich descriptions on every field, enumerations for every finite domain, and explicit examples on every endpoint. An agent reading the spec should be able to use the API correctly without reading the handler. This is the difference between a spec that compiles and a spec that teaches.

### 8. Async events are contracts too [#8-async-events-are-contracts-too]

We treat WebSocket and Pub/Sub events with the same rigour as HTTP — an AsyncAPI spec, generated client and server models, additive evolution. Events that are "informal" today are the integration bugs of next quarter.

## How we apply this [#how-we-apply-this]

* [Core API Reference](/docs/reference/api/core) — rendered directly from the OpenAPI spec.
* [Core Events Reference](/docs/reference/events/core-ws) — rendered from the AsyncAPI spec.
* [Add an API Endpoint](/docs/guides/add-api-endpoint) — the canonical workflow.
* [Code Generation](/docs/guides/code-generation) — the tooling that keeps code in sync with specs.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Breaking changes without a version bump.** "It is a small breaking change, no one uses that field" — the assumption is always wrong in an agent-consuming world.
* **Hand-written clients.** Clients drift, and drift causes outages. Generate.
* **Kitchen-sink endpoints.** `POST /doThing` that accepts a 40-field payload and does everything. Split it.
* **Error payloads as strings.** A 400 response body of `"invalid input"` is unusable by any automated caller. Structured errors, always.
* **Endpoint-scoped pagination conventions.** Cursor in the body here, page-number in a query string there, offset-limit somewhere else. Pick one and apply it universally.

## Further reading [#further-reading]

* *Designing Web APIs*, Jin, Sahni, Shevat — the working bible of HTTP API design.
* *Web API Design: The Missing Link*, Apigee — the short handbook that gets the REST vocabulary right.
* *AsyncAPI Specification* ([asyncapi.com](https://www.asyncapi.com)) — the canonical format for async contracts.
* *OpenAPI Specification* ([openapis.org](https://www.openapis.org)) — the canonical format for HTTP contracts.


# Data Engineering (/docs/principles/system-design/data-engineering)


# Data Engineering [#data-engineering]

## TL;DR [#tldr]

Data outlives services. We treat every event we emit and every table we own as a long-term contract, shaped so downstream consumers — today and in three years — can work with it without archaeology. Events are append-only, schemas are versioned, and the log of what happened is preserved even when the current-state projection is rebuilt.

## Why this matters [#why-this-matters]

Services are replaced; data lives on. The user records, Meeting histories, and Transcriptions created by this year's Wordloop will still be in the database when the code that created them has been rewritten twice. The data contracts we set today — table shapes, event payloads, field semantics — are the single most durable thing we will produce. Getting the contract right once is cheap; changing it retroactively after the data has multiplied is brutal.

## Our principles [#our-principles]

### 1. Events are append-only and immutable [#1-events-are-append-only-and-immutable]

Once an event is emitted, it is never rewritten. Correction happens through *compensating* events (a "segment-deleted" event that references the original), not through mutation of the original. This is the discipline that lets downstream consumers trust the event log as a truthful history of the system.
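A toy in-memory log makes the compensating-event idea concrete. The `Event` fields, the `Log` type, and the `segment-added` event name are assumptions for the sketch; only `segment-deleted` appears in the text above.

```go
package main

import "fmt"

// Event is an immutable fact. Field names are illustrative.
type Event struct {
	Seq       int
	Type      string // "segment-added" or "segment-deleted"
	SegmentID string
	Text      string
}

// Log only ever grows; it has no update or delete method.
type Log struct{ events []Event }

// Append assigns the next sequence number and stores the event.
func (l *Log) Append(e Event) Event {
	e.Seq = len(l.events) + 1
	l.events = append(l.events, e)
	return e
}

// Events returns a copy so callers cannot mutate history.
func (l *Log) Events() []Event { return append([]Event(nil), l.events...) }

// project folds the log into current state: segment id -> text.
func project(events []Event) map[string]string {
	state := map[string]string{}
	for _, e := range events {
		switch e.Type {
		case "segment-added":
			state[e.SegmentID] = e.Text
		case "segment-deleted":
			delete(state, e.SegmentID) // compensates the earlier add
		}
	}
	return state
}

func main() {
	var l Log
	l.Append(Event{Type: "segment-added", SegmentID: "s1", Text: "hello"})
	l.Append(Event{Type: "segment-added", SegmentID: "s2", Text: "wrold"})
	l.Append(Event{Type: "segment-deleted", SegmentID: "s2"}) // a correction, not a mutation
	fmt.Println(len(l.Events()), project(l.Events()))
}
```

The mistaken segment is gone from the projection, yet all three events remain in the log, so a consumer replaying history still sees what actually happened.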

### 2. Schemas are versioned and evolvable [#2-schemas-are-versioned-and-evolvable]

Event payloads have explicit versions. New fields are additive; removed fields are deprecated with a deadline, not removed silently. Consumers can detect an old schema and handle it or refuse it — they are never surprised. This is the AsyncAPI discipline ([API Design](/docs/principles/system-design/api-design)) applied to every stream.

### 3. Partition keys are chosen deliberately [#3-partition-keys-are-chosen-deliberately]

Event topics partition by the identifier that matters for ordering (typically `meeting_id`, which maps to the ordering key in Pub/Sub), so that all events for a single Meeting flow through a single partition in sequence. Choosing a partition key casually is one of the most expensive mistakes in a data system; we treat it as a design decision that deserves review.
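The reason a key gives per-Meeting ordering is that a deterministic hash always sends the same key to the same partition. The sketch below shows that general idea with a hypothetical `partitionFor` helper; a managed broker normally performs this routing itself, so this is an illustration of the principle rather than code a service would ship.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a key to one of n partitions. The same key always lands
// on the same partition, which is what preserves per-key ordering.
func partitionFor(key string, n uint32) uint32 {
	if n == 0 {
		return 0 // avoid dividing by zero; treat "no partitions" as a single one
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32() % n
}

func main() {
	// Two events for the same Meeting share a partition; other Meetings may not.
	for _, meetingID := range []string{"meeting-1", "meeting-2", "meeting-1"} {
		fmt.Println(meetingID, partitionFor(meetingID, 8))
	}
}
```

Keying by something coarser (for example a tenant) would serialise unrelated Meetings behind each other, while keying by something finer (for example a segment) would lose ordering within a Meeting.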

### 4. CQRS where it pays [#4-cqrs-where-it-pays]

For read-heavy surfaces with complex projections — the synthesis dashboard, the Meeting timeline — we maintain a read model separate from the write model. The write model owns truth; the read model owns query performance. We do not apply CQRS universally; we apply it where the read load and the write load have genuinely different shapes.

### 5. Event sourcing is a tool, not a religion [#5-event-sourcing-is-a-tool-not-a-religion]

For domains where the history of change is itself the product — audit logs, participation timelines — we store the event log as the primary artefact and derive current state from it. For domains where current state is what matters, we store current state and publish events as derivatives. Event sourcing every table "because it is purer" is overengineering.

### 6. Data contracts are documented, versioned, and owned [#6-data-contracts-are-documented-versioned-and-owned]

Every significant table and every published event has an owner, a documented schema, a migration history, and a compatibility policy. Consumers find this on the [Database Reference](/docs/reference/database) and the [Events Reference](/docs/reference/events/core-pubsub). Unowned tables and undocumented events are a ticking integration-debt clock.

### 7. Retention is a design decision [#7-retention-is-a-design-decision]

Every dataset we store has a retention policy — deletion after N days, archival after M days, live forever. Retention is decided when the dataset is created, reviewed when the regulatory surface changes ([Privacy](/docs/principles/quality/privacy)), and enforced by automation. "We will figure it out later" is the decision that becomes a compliance incident three years later.

### 8. Backfills are a planned operation [#8-backfills-are-a-planned-operation]

Changing the shape of historical data — renaming a field, re-computing a derived column — is a project with a plan, a rollback, and a measurement. We do not backfill by running a script on a Tuesday and hoping. Backfills are rehearsed in staging and measured in production.

## How we apply this [#how-we-apply-this]

* [Database Reference](/docs/reference/database) — the authoritative schema of Postgres.
* [Core Pub/Sub Events](/docs/reference/events/core-pubsub) — the AsyncAPI contract for downstream consumers.
* [Migrate the Schema](/docs/guides/migrate-schema) — the discipline around DDL changes.
* [Postgres](/docs/principles/stack/postgres) — how we apply these principles inside our chosen database.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Silent schema changes.** Renaming a column in a hot table without coordinating consumers. This is how outages start.
* **Mutable event logs.** Going back and "fixing" a past event. The event is what happened; the correction is a new event.
* **Kitchen-sink "events" table.** One table that accepts a JSON blob for every kind of event. The type system is the best friend of a data contract; do not throw it away.
* **Backfills in production without rehearsal.** See above.
* **Retention by accident.** Tables that grow forever because no one considered retention at creation time.

## Further reading [#further-reading]

* *Designing Data-Intensive Applications*, Martin Kleppmann — the single best survey of the territory, including the chapters on derived data, stream processing, and batch processing.
* *Data Mesh*, Zhamak Dehghani — the argument for treating data as a first-class product with owners.
* *Streaming Systems*, Akidau, Chernyak, Lax — the deep treatment of time, watermarks, and windowing in stream processing.
* *Event Sourcing and CQRS*, Vaughn Vernon (the relevant chapters of *Implementing DDD*) — a grounded, implementation-focused view.


# Hexagonal Architecture (/docs/principles/system-design/hexagonal-architecture)


# Hexagonal Architecture [#hexagonal-architecture]

## TL;DR [#tldr]

Every non-trivial service we build is structured as a hexagon: a domain core surrounded by ports (interfaces) and adapters (implementations at the edges). Dependencies always flow inward — the domain depends on nothing; adapters depend on the domain through ports. This is the single highest-leverage structural choice we make, and it is deliberately non-negotiable for new services.

## Why this matters [#why-this-matters]

Hexagonal architecture — also called *ports and adapters* — was first articulated by Alistair Cockburn in 2005 as a way to keep the "inside" of an application (its business logic) isolated from the "outside" (databases, HTTP, message queues, UI). Twenty years later, it is an obvious fit for an AI-assisted engineering workflow for one reason that did not exist when Cockburn first wrote about it:

> **Agents perform best inside environments with strong, consistent structural constraints.**

In a hexagonal codebase, every file has a determined place: domain entities in the core, ports as interfaces, adapters at the edges, application services orchestrating them. When an agent is asked to add a new endpoint, a new integration, or a new persistence backend, the answer to "where does this code live?" is already decided by the architecture. The agent does not have to invent the layout — it inherits it. This collapses the decision space dramatically and produces code that reliably matches the existing shape.

We have seen this effect directly in `wordloop-ml`, which is the most aggressively hexagonal service in the platform. Agent-authored changes to `wordloop-ml` converge faster, require less rework, and land with fewer review comments than equivalent changes to less-structured code elsewhere. The architecture pays for itself every time.

## Our principles [#our-principles]

### 1. The domain depends on nothing [#1-the-domain-depends-on-nothing]

The innermost layer — the domain — contains the entities, value objects, and business rules that define what the service is. It imports no framework, no driver, no HTTP library, no SQL client. This is not dogma; it is the mechanism that makes the rest of the architecture work. A domain with framework imports cannot be tested in isolation, cannot be reused across adapters, and cannot be reasoned about independently of the infrastructure below it.

### 2. Ports are interfaces owned by the domain [#2-ports-are-interfaces-owned-by-the-domain]

A port is an interface declared *in the domain's language*, describing a capability the domain needs. `TranscriptRepository`, `EventPublisher`, `ModelClient` — each port speaks in terms the domain cares about, not in terms of the underlying technology. Crucially, the port is owned by the domain, not by the adapter. An adapter is expected to conform to the port; the port is never shaped around what is convenient for the adapter.

### 3. Adapters live at the edges and are interchangeable [#3-adapters-live-at-the-edges-and-are-interchangeable]

Adapters translate between the outside world and the ports. A Postgres adapter implements `TranscriptRepository`; a gRPC adapter implements `ModelClient`; an HTTP adapter at the *driving* edge turns inbound requests into calls on the application service. Adapters are interchangeable — swapping Postgres for DynamoDB should require zero change to the domain or to any other adapter. If it requires more, the port is leaking implementation details and must be redesigned.
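
A minimal sketch of a port with two interchangeable adapters. `TranscriptRepository` is the port named above; the `Transcript` fields, the method names, and both adapter classes are assumptions for illustration, with SQLite standing in for Postgres so the example runs anywhere.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Transcript:
    meeting_id: str
    text: str


class TranscriptRepository(Protocol):
    """Port: phrased in domain terms, with no storage vocabulary."""

    def save(self, transcript: Transcript) -> None: ...
    def find(self, meeting_id: str) -> Optional[Transcript]: ...


class InMemoryTranscriptRepository:
    """Adapter backed by a dict (useful as a test double)."""

    def __init__(self) -> None:
        self._rows: dict[str, Transcript] = {}

    def save(self, transcript: Transcript) -> None:
        self._rows[transcript.meeting_id] = transcript

    def find(self, meeting_id: str) -> Optional[Transcript]:
        return self._rows.get(meeting_id)


class SqliteTranscriptRepository:
    """Adapter backed by SQL; SQLite stands in for Postgres in this sketch."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS transcripts "
            "(meeting_id TEXT PRIMARY KEY, text TEXT NOT NULL)"
        )

    def save(self, transcript: Transcript) -> None:
        with self._conn:  # commit on success, roll back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO transcripts (meeting_id, text) VALUES (?, ?)",
                (transcript.meeting_id, transcript.text),
            )

    def find(self, meeting_id: str) -> Optional[Transcript]:
        row = self._conn.execute(
            "SELECT meeting_id, text FROM transcripts WHERE meeting_id = ?",
            (meeting_id,),
        ).fetchone()
        return Transcript(*row) if row else None


def word_count(repo: TranscriptRepository, meeting_id: str) -> int:
    """Domain-side code: depends only on the port, never on an adapter."""
    transcript = repo.find(meeting_id)
    return len(transcript.text.split()) if transcript else 0
```

`word_count` behaves identically with either adapter, and nothing in the port's signature reveals which storage engine sits behind it. That is the interchangeability test in miniature.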

### 4. Dependencies flow inward, and this is enforceable [#4-dependencies-flow-inward-and-this-is-enforceable]

The fundamental rule: an adapter may depend on a port (which lives in the domain), but the domain may never depend on an adapter. This rule is automatable — `depguard` in Go, `import-linter` in Python, ESLint import rules in TypeScript — and we enforce it in CI. Code that violates the inward-flow rule fails the build. This is what turns "hexagonal" from a style into a guarantee.
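
To show that the rule is mechanically checkable, here is a toy check written against Python's standard `ast` module. It is not how `import-linter` or `depguard` work internally, and the function name and prefixes are assumptions for this sketch; it only illustrates the idea of failing a build when domain code imports something it should not.

```python
import ast


def forbidden_imports(source: str, forbidden: tuple[str, ...]) -> list[str]:
    """Return imported module names in `source` that match a forbidden prefix."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        for name in names:
            # Match the package itself or any submodule, but not look-alike names.
            if any(name == p or name.startswith(p + ".") for p in forbidden):
                found.append(name)
    return found
```

A CI step could run a check like this over every file in the domain package and fail when the result is non-empty. In practice the dedicated linters are the better choice because they understand the whole import graph.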

### 5. The application service orchestrates; it does not contain business rules [#5-the-application-service-orchestrates-it-does-not-contain-business-rules]

Application services (often called use-case services) coordinate ports and domain entities to fulfil a use case. "Process an incoming TranscriptSegment" or "generate a MeetingSynthesis for this Meeting" is an application-service method. Business rules — "a MeetingSynthesis cannot be generated until the Transcription is finalised" — live in the domain entity, not in the application service. The split is subtle but load-bearing: mixing rules into orchestration means the rules are not portable across drivers (CLI, HTTP, background job), which defeats the point.
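
The split can be sketched as follows. All class and method names here (`Meeting.synthesise`, `GenerateSynthesis`, the two ports) are hypothetical and chosen for the example; the synthesis itself is a placeholder string.

```python
from dataclasses import dataclass
from typing import Protocol


class TranscriptionNotFinalised(Exception):
    """Raised when a synthesis is requested too early."""


@dataclass
class Meeting:
    meeting_id: str
    transcript: str = ""
    finalised: bool = False

    def synthesise(self) -> str:
        # Business rule: lives on the entity, so every driver gets it for free.
        if not self.finalised:
            raise TranscriptionNotFinalised(self.meeting_id)
        return f"Synthesis of {len(self.transcript.split())} words"


class MeetingRepository(Protocol):
    def get(self, meeting_id: str) -> Meeting: ...


class SynthesisPublisher(Protocol):
    def publish(self, meeting_id: str, synthesis: str) -> None: ...


class GenerateSynthesis:
    """Application service: orchestrates ports and entities, holds no rules."""

    def __init__(self, meetings: MeetingRepository, publisher: SynthesisPublisher) -> None:
        self._meetings = meetings
        self._publisher = publisher

    def __call__(self, meeting_id: str) -> str:
        meeting = self._meetings.get(meeting_id)
        synthesis = meeting.synthesise()  # the entity decides whether this is allowed
        self._publisher.publish(meeting_id, synthesis)
        return synthesis
```

Because the rule sits on `Meeting`, a CLI command or background job that calls `synthesise()` directly is bound by it too. The application service only sequences the steps.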

### 6. Ports are natural test seams [#6-ports-are-natural-test-seams]

Hexagonal makes the core domain trivially testable without touching infrastructure, because every outbound dependency is a port that can be stubbed or replaced with a high-fidelity emulator. At the same time, application services can be tested end-to-end with real adapters (see [Testing](/docs/principles/foundations/testing)) because the adapters are narrow and replaceable. The architecture tells you what to test with a real container and what to test with a stub: test the adapter against the real thing it wraps; test the application service against stubs of the ports it consumes.

### 7. Keep the hexagon shallow [#7-keep-the-hexagon-shallow]

Hexagonal is not an invitation to pile on layers. The mistake we actively guard against is the "onion with ten rings" pattern: entity layer, repository layer, service layer, handler layer, DTO mapping layer, controller layer, and on and on. Three conceptual zones are enough: **domain**, **ports + application services**, **adapters**. Anything more is ritual, not rigour.

### 8. The architecture is language-agnostic [#8-the-architecture-is-language-agnostic]

Hexagonal is a mental model, not a framework. It applies equally in Go, Python, TypeScript, and any future language we adopt. The file-layout conventions differ — in Go we tend toward flat package trees with internal interfaces; in Python we use explicit `ports/` and `adapters/` directories; in TypeScript we use feature folders with `*.port.ts` and `*.adapter.ts` suffixes — but the structure and the dependency rule are the same everywhere. Agents and engineers who internalise the pattern stay productive when the stack changes.

## How we apply this [#how-we-apply-this]

**In `wordloop-core` (Go).** The service follows a gateway pattern that implements hexagonal in Go idioms: handlers at the edge call into application services, which depend on repository and publisher interfaces. Postgres, Pub/Sub, and Clerk are adapters. See the [Core Service handbook](/docs/learn/services/core) for the layout.

**In `wordloop-ml` (Python).** Explicit `domain/`, `ports/`, `adapters/`, and `application/` packages enforced by `import-linter` rules in CI. Model clients, storage backends, and the FastAPI router are all adapters. See the [ML Service handbook](/docs/learn/services/ml).

**In `wordloop-app` (TypeScript).** Frontend code does not need a full hexagonal split — the "outside" is the browser and the "inside" is React component state — but we apply the spirit of the pattern by isolating network I/O behind a thin SWR layer and keeping pure rendering logic free of data-fetching concerns. See the [App Service handbook](/docs/learn/services/app).

**In new services.** Any new backend service ships hexagonal from day one. The bootstrapping template includes the directory layout, the import-linter rules, and a stub domain + one adapter to demonstrate the flow.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Framework-coupled domain.** If the domain imports `fastapi.Request` or `gin.Context`, the domain is no longer the domain.
* **Anaemic domain models.** Data classes with no behaviour and a thick application service that knows all the rules. The rules belong on the entities.
* **Leaky ports.** A port with `gorm.DB` in its signature is not a port — it is a Postgres interface wearing a costume.
* **"Pragmatic" layer-skipping.** Handlers that talk directly to repositories because "it is just a simple endpoint." This is how architecture erodes: one simple endpoint at a time.
* **Per-adapter domain types.** Different entity definitions in the domain vs. the persistence layer vs. the API layer. Map across boundaries explicitly in the adapter, not by redefining the type.
* **Onion-style over-layering.** Five layers of DTO translation between HTTP and the domain. Adapters should be thin: a handler reads the request, calls an application service, and writes the response. That is it.

## Further reading [#further-reading]

* *Hexagonal Architecture*, Alistair Cockburn (2005) — the original source. Read this first.
* *Implementing Domain-Driven Design*, Vaughn Vernon — the expanded, practical treatment of the pattern and its relationship to DDD.
* *Clean Architecture*, Robert C. Martin — an overlapping framework (the "clean" vs. "hexagonal" distinction is mostly vocabulary) that makes the dependency-inversion argument explicitly.
* *Get Your Hands Dirty on Clean Architecture*, Tom Hombergs — a code-first tour in Java that translates cleanly to other languages.
* Mark Seemann's essays on [blog.ploeh.dk](https://blog.ploeh.dk) — in particular his treatment of ports-as-dependencies and the composition root pattern.


# System Design (/docs/principles/system-design)


# System Design [#system-design]

System design is where abstract principle meets concrete structure. The choices in this section — how we shape services, how our APIs evolve, how real-time flows handle backpressure, how data moves, how systems integrate with each other — are the ones that compound most heavily as the platform grows.

**Hexagonal architecture is the cornerstone.** It appears first in this list deliberately. Every other page here assumes that services are internally structured as ports and adapters around a domain core — if that assumption collapses, nothing else we write about system design applies cleanly.

<Cards>
  <Card title="Hexagonal Architecture" href="/docs/principles/system-design/hexagonal-architecture" description="Ports, adapters, unidirectional dependency flow, and why this structure is the single highest-leverage choice for an agent-led codebase." />

  <Card title="API Design" href="/docs/principles/system-design/api-design" description="Contract-first design, versioning, evolution, pagination, and AI-agent readiness." />

  <Card title="Real-Time" href="/docs/principles/system-design/real-time" description="WebSockets, streaming, backpressure, echo suppression, and the resiliency patterns that make live experiences survive degraded networks." />

  <Card title="Data Engineering" href="/docs/principles/system-design/data-engineering" description="Events, streams, CQRS, event sourcing, and the data contracts that outlive any service." />

  <Card title="Integration Patterns" href="/docs/principles/system-design/integration-patterns" description="Webhooks, the outbox pattern, idempotency, and the sync-vs-async trade-off." />
</Cards>


# Integration Patterns (/docs/principles/system-design/integration-patterns)


# Integration Patterns [#integration-patterns]

## TL;DR [#tldr]

Services integrate through a small set of well-understood patterns: synchronous request/response for reads and strict writes, asynchronous events for everything else, the transactional outbox for "database and event must agree," webhooks for pushing to third parties, and idempotency everywhere. We pick the pattern based on the guarantee the integration needs, not on whatever felt easy at the time.

## Why this matters [#why-this-matters]

Most production incidents we have seen — across teams, across years — trace back to an integration that chose the wrong consistency model. A synchronous call where an async event belonged, an event without idempotency, a "fire and forget" webhook that silently dropped on a retry. Integration patterns are one of the few areas where the cost of getting it wrong is paid every day, forever, in an intermittent stream of weirdness. Getting them right means thinking about guarantees explicitly, not architectural fashion.

## Our principles [#our-principles]

### 1. Default to async; upgrade to sync when required [#1-default-to-async-upgrade-to-sync-when-required]

For any inter-service communication, async events are the default. We upgrade to synchronous RPC only when we need the response value inline (most user-facing reads) or when the caller needs the commit to have happened before it proceeds (strict writes). Making sync the default couples services together in ways that are invisible in code and disastrous at load.

### 2. Use the outbox pattern when a DB write and an event must agree [#2-use-the-outbox-pattern-when-a-db-write-and-an-event-must-agree]

When a state change requires both a database write and an event emission, and we need both or neither, we use the transactional outbox: write the event to an `outbox` table inside the same transaction as the state change, then let a worker relay the outbox to the broker. Without distributed transactions, this is the simplest way to get that guarantee. The tempting alternative ("just emit after commit") leaks inconsistencies whenever the process dies between the two steps. The relay delivers at least once, so consumers must still be idempotent.
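
A minimal sketch of the pattern using SQLite from the standard library in place of Postgres. The table layout, the topic name `meeting.finalised`, and the function names are assumptions for this example, and `publish` is whatever callable hands a message to the broker.

```python
import json
import sqlite3
from typing import Callable


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE meetings (meeting_id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def finalise_meeting(conn: sqlite3.Connection, meeting_id: str) -> None:
    """State change and outbox row commit together, or not at all."""
    with conn:  # one transaction: commit on success, roll back on any exception
        cur = conn.execute(
            "UPDATE meetings SET status = 'finalised' WHERE meeting_id = ?",
            (meeting_id,),
        )
        if cur.rowcount != 1:
            raise LookupError(f"unknown meeting: {meeting_id}")
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("meeting.finalised", json.dumps({"meeting_id": meeting_id})),
        )


def relay_once(conn: sqlite3.Connection, publish: Callable[[str, dict], None]) -> int:
    """Publish pending outbox rows in order; returns how many were relayed."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        # If publish raises, the row stays unpublished and is retried next run.
        publish(topic, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

If the process dies after `publish` but before the row is marked, the next relay run publishes the same event again. That duplicate is the price of never losing an event, and it is why the consumer side has to be idempotent.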

### 3. Every consumer is idempotent [#3-every-consumer-is-idempotent]

Every message handler — webhook receiver, Pub/Sub worker, retry-on-failure task — is idempotent. It either carries its own de-duplication key or it operates on keys that make replay safe (an `UPSERT` on a natural key, a conditional update guarded by a version). "At-least-once delivery" is the only delivery guarantee we ever get, and idempotent handling is the only response that works.
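
A minimal sketch of a handler that carries its own de-duplication key. It assumes each message has a unique `id` field (the CloudEvents envelope requires one) and keeps everything in memory; the class name and the counted field are hypothetical.

```python
class TaskCounter:
    """Counts extracted tasks per meeting, applying each message at most once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.task_counts: dict[str, int] = {}

    def handle(self, message: dict) -> bool:
        """Apply a message once. Returns False when the delivery was a duplicate."""
        key = message["id"]
        if key in self._seen:
            return False  # replayed delivery: acknowledge it, change nothing
        meeting_id = message["meeting_id"]
        self.task_counts[meeting_id] = self.task_counts.get(meeting_id, 0) + 1
        self._seen.add(key)
        return True
```

In a real handler the de-duplication record and the state change would need to be written in the same database transaction. Otherwise a crash between the two reintroduces the inconsistency the key was meant to prevent.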

### 4. Retries have policies, not just defaults [#4-retries-have-policies-not-just-defaults]

Every retry policy has an explicit maximum, an explicit backoff curve, and an explicit dead-letter destination. "Retry forever with 1-second backoff" is not a policy; it is how a transient failure becomes a thundering herd. A message that exhausts its retries lands in the dead-letter queue and fires an alert; the queue is not a garbage bin.
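
The three explicit parts of a policy can be sketched like this. `RetryPolicy`, `deliver`, and the injected `sleep` and `dead_letter` callables are names invented for this example; jitter is left out to keep the curve easy to read.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int  # explicit maximum
    base_delay: float  # seconds before the first retry
    max_delay: float   # cap on the backoff curve

    def delay_before_retry(self, attempt: int) -> float:
        """Delay after failed attempt number `attempt` (1-based): exponential, capped."""
        return min(self.max_delay, self.base_delay * 2 ** (attempt - 1))


def deliver(
    message: Any,
    handler: Callable[[Any], None],
    policy: RetryPolicy,
    dead_letter: Callable[[Any], None],
    sleep: Callable[[float], None],
) -> bool:
    """Try the handler up to max_attempts times, then dead-letter the message."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            handler(message)
            return True
        except Exception:
            if attempt == policy.max_attempts:
                break  # no sleep after the final attempt
            sleep(policy.delay_before_retry(attempt))
    dead_letter(message)  # explicit destination; this is where the alert hooks in
    return False
```

Passing `time.sleep` as `sleep` gives the real behaviour. Injecting it keeps the policy testable without waiting.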

### 5. Webhooks verify, sign, and replay [#5-webhooks-verify-sign-and-replay]

Inbound webhooks are authenticated with an HMAC signature over the payload, not with a shared secret in the query string. Outbound webhooks are signed the same way. Both sides are replay-safe (the receiver stores the signature and rejects duplicates), and both sides surface the delivery and retry history to the sender. Unsigned webhooks are not webhooks; they are unauthenticated POST endpoints.
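
A minimal sketch of signing and verification with the standard library's `hmac` module. HMAC-SHA256 with a hex-encoded signature is an assumption here, as are the function and class names; the actual header format and algorithm depend on the provider.

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """HMAC-SHA256 over the raw payload bytes, hex-encoded."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret: bytes, payload: bytes, signature: str) -> bool:
    """Constant-time comparison, so timing does not leak the expected value."""
    expected = sign(secret, payload)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


class WebhookReceiver:
    """Verifies the signature first, then rejects deliveries it has already seen."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._seen: set[str] = set()

    def accept(self, payload: bytes, signature: str) -> str:
        if not verify(self._secret, payload, signature):
            return "rejected"
        if signature in self._seen:
            return "duplicate"
        self._seen.add(signature)
        return "accepted"
```

Signing the payload alone means two legitimately identical payloads share a signature. Schemes that include a timestamp or delivery ID in the signed content avoid that collision; this sketch does not.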

### 6. Timeouts are end-to-end budgets [#6-timeouts-are-end-to-end-budgets]

Every synchronous call has a timeout, and the timeout is allocated from a *budget* set by the outermost caller. A request with a 2-second budget at the edge does not get to spend 1.5 seconds on a single downstream call — that leaves no slack for retries, for the handler itself, or for the next downstream. Budgeting is a cooperative discipline; without it, tail latencies compound unpredictably ([Performance](/docs/principles/quality/performance)).
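
One way to make the budget concrete is a small deadline object created at the edge and passed down the call chain. The `Budget` class and the "at most half of what remains" rule are assumptions for this sketch, not an established convention in the codebase.

```python
import time
from typing import Callable


class Budget:
    """A request-scoped deadline that downstream calls draw their timeouts from."""

    def __init__(self, total_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._deadline = clock() + total_seconds

    def remaining(self) -> float:
        return max(0.0, self._deadline - self._clock())

    def timeout_for_call(self, share: float = 0.5) -> float:
        """Timeout for one downstream call: at most `share` of what is left."""
        remaining = self.remaining()
        if remaining <= 0:
            raise TimeoutError("request budget exhausted")
        return remaining * share
```

With a 2-second budget, the first downstream call gets at most 1 second, which leaves slack for a retry, the handler's own work, and the next downstream. Once the budget is spent, the caller fails fast instead of starting a call it cannot finish.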

### 7. Circuit breakers protect the system from itself [#7-circuit-breakers-protect-the-system-from-itself]

When a downstream is failing, we stop calling it. A circuit breaker opens after a threshold of failures, fails subsequent calls immediately, and probes the downstream periodically to see whether it has recovered. This protects us from hammering a recovering service and protects upstream callers from tying up threads waiting for an inevitable timeout.
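
A minimal single-threaded sketch of the three behaviours: open on a failure threshold, fail fast while open, and allow a probe after a cool-down. The class, its parameters, and the consecutive-failure counting rule are assumptions; production breakers usually add locking and rolling failure windows.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpen(Exception):
    """Raised instead of calling a downstream that is known to be failing."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int,
        reset_after: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], Any]) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpen("downstream is failing; call skipped")
            # Cool-down elapsed: fall through and let this call act as the probe.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed probe re-opens immediately; otherwise open at the threshold.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self._opened_at = None  # success closes the circuit
        return result
```

While the circuit is open, callers get `CircuitOpen` without waiting on a timeout, and the downstream receives no traffic until the probe.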

### 8. Every integration has a contract test [#8-every-integration-has-a-contract-test]

A test that exercises the real integration — the real signature verification, the real retry curve, the real idempotency behaviour — runs in CI against an emulator. "It works in the happy path" is not a test; an integration that has only happy-path coverage is an incident waiting for its trigger.

## How we apply this [#how-we-apply-this]

* [Core Pub/Sub Events](/docs/reference/events/core-pubsub) — the async event contracts.
* [Core API Reference](/docs/reference/api/core) — the sync contracts.
* [Reliability](/docs/principles/quality/reliability) — the broader system-level treatment of failure modes.
* [Testing](/docs/principles/foundations/testing) — the contract-testing discipline.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Sync chains three deep.** Service A calls B calls C calls D. Every failure mode in the chain is now a failure mode for A.
* **"Fire and forget" webhooks.** No signature, no retry, no idempotency. Works once; the next incident it causes is unfixable from the outside.
* **Commit-and-then-publish.** Without the outbox, the two-step process will leave the system inconsistent every time a process dies between steps. It will happen.
* **Global retry policies.** "All HTTP calls retry 3 times with 1-second backoff." What matters is the *specific* downstream's failure profile and the caller's latency budget.
* **Dead-letter queues as logs.** If the DLQ is silently accumulating, integration is not working; it is just failing quietly. Alert and act.

## Further reading [#further-reading]

* *Release It!*, Michael Nygard — the canonical treatment of stability patterns (circuit breakers, bulkheads, timeouts).
* *Enterprise Integration Patterns*, Hohpe & Woolf — old but foundational; the vocabulary most of this page inherits.
* *Microservices Patterns*, Chris Richardson — a practical mapping of these patterns onto a modern service architecture.
* *Life Beyond Distributed Transactions*, Pat Helland — the paper that made the outbox pattern obvious in retrospect.


# Real-Time (/docs/principles/system-design/real-time)


# Real-Time [#real-time]

## TL;DR [#tldr]

Real-time features are long-lived bidirectional contracts between client and server, not request-response interactions. They must survive reconnection, handle backpressure without losing state, and never treat the network as a guarantee. Every real-time feature we ship is designed against the assumption that the connection will drop mid-flight.

## Why this matters [#why-this-matters]

A Meeting is a real-time experience: audio streams up, transcript segments stream down, people come and go, and none of this can pause for a refresh. The difference between a real-time product that feels smooth and one that feels broken is almost always the difference in how the implementers thought about failure modes: reconnection, duplicated messages, out-of-order delivery, and backpressure. Getting real-time right is less about picking a protocol and more about having a disciplined stance on those failure modes.

## Our principles [#our-principles]

### 1. WebSocket is the default transport [#1-websocket-is-the-default-transport]

For bidirectional, persistent connections we use WebSockets. Server-Sent Events are fine for one-way server-to-client streams but we avoid them for anything a client might want to influence. Long-polling is rejected outright — it gives the worst of both latency profiles.

### 2. Every message carries a sequence number [#2-every-message-carries-a-sequence-number]

Every event on the wire carries a monotonically increasing sequence number. The client can detect gaps; the server can detect duplicates; together they can resynchronise after a reconnect without the client having to refetch everything. Sequence numbers are per-session, not global.
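
A minimal sketch of the receiving side, assuming sequence numbers start at 1 and increase by exactly 1 within a session. The class name and the three labels are invented for this example.

```python
class SequenceTracker:
    """Tracks the last in-order sequence number seen in one session."""

    def __init__(self, last_seen: int = 0) -> None:
        self.last_seen = last_seen

    def classify(self, seq: int) -> str:
        """Label an incoming sequence number and advance only on in-order delivery."""
        if seq <= self.last_seen:
            return "duplicate"  # already processed: safe to ignore
        if seq == self.last_seen + 1:
            self.last_seen = seq
            return "in_order"
        return "gap"  # something was missed: resync from last_seen + 1
```

On a `"gap"` the tracker deliberately does not advance, so `last_seen` remains the correct point to resume from after a reconnect.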

### 3. Reconnection is the normal case, not the error case [#3-reconnection-is-the-normal-case-not-the-error-case]

Clients reconnect with exponential backoff, jitter, and a resumption token that tells the server where they left off. "Reconnected" is logged, not paged. If the reconnection rate spikes, that is a signal about network or server health, not a client bug.
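
The backoff-with-jitter part can be sketched as below, using the "full jitter" variant: each delay is drawn uniformly between zero and an exponentially growing, capped ceiling. The function name and the default `base` and `cap` values are assumptions, and the production client is TypeScript, so this only illustrates the curve.

```python
import random
from typing import Optional


def reconnect_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Seconds to wait before reconnect attempt `attempt` (0-based)."""
    if rng is None:
        rng = random.Random()
    ceiling = min(cap, base * 2 ** attempt)
    return rng.uniform(0.0, ceiling)
```

The jitter is what stops every client that dropped at the same moment from reconnecting at the same moment. Without it, a brief server blip turns into a synchronised reconnect storm.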

### 4. Backpressure is explicit [#4-backpressure-is-explicit]

When the server is producing faster than the client is consuming, the server either **sheds** (drops non-critical messages, logs the shed rate), **coalesces** (merges consecutive updates into a single event), or **blocks** (applies flow control). What it does *not* do is buffer without bound. Unbounded buffering is how a real-time service dies.
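
A minimal sketch of a bounded per-connection buffer that combines two of the strategies: it coalesces updates that share a key and sheds the oldest entry when full. The class name and the shed-oldest rule are assumptions; a real implementation would decide what to shed based on message criticality.

```python
from collections import OrderedDict


class CoalescingBuffer:
    """Bounded outbound buffer: coalesce by key, shed the oldest entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()
        self.shed_count = 0  # exported as a metric so the shed rate is visible

    def offer(self, key: str, event: dict) -> None:
        if key in self._items:
            self._items[key] = event  # coalesce: newer update replaces the pending one
            return
        if len(self._items) >= self._capacity:
            self._items.popitem(last=False)  # shed the oldest pending entry
            self.shed_count += 1
        self._items[key] = event

    def drain(self) -> list:
        """Hand everything pending to the socket writer and empty the buffer."""
        events = list(self._items.values())
        self._items.clear()
        return events
```

Memory use is capped at `capacity` entries per connection regardless of how slow the client is, which is the property that matters.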

### 5. Echo suppression is a design concern [#5-echo-suppression-is-a-design-concern]

In a Meeting with live transcription, a person's outgoing audio must not be re-ingested as incoming segments. The mechanism — whether it is a speaker ID on the stream, a per-session filter at the gateway, or client-side gating — is part of the protocol design, not a bolt-on fix. Echo handling is specified before the first line of code.

### 6. Idempotent handlers [#6-idempotent-handlers]

The client will reconnect and resend. The server must handle the resend gracefully — the same sequence number processed twice has no additional effect. This is the same principle as HTTP idempotency keys ([API Design](/docs/principles/system-design/api-design)), applied to the streaming surface.

### 7. Observability is unbroken across the socket [#7-observability-is-unbroken-across-the-socket]

A trace that enters via HTTP, opens a WebSocket, streams 10,000 events, and closes — all of it belongs to the same trace. Our OTel instrumentation propagates the trace context into the socket, and every event carries it forward. A broken real-time trace is a test failure, not a nuisance (see [Testing](/docs/principles/foundations/testing)).

### 8. Client state is recoverable, not sacred [#8-client-state-is-recoverable-not-sacred]

Any state held on the client that matters must be recoverable from the server. We do not rely on the client's in-memory view surviving. If the client crashes or navigates away, rejoining the session should produce the same observable state — the server is the source of truth.

## How we apply this [#how-we-apply-this]

* [Core Events Reference](/docs/reference/events/core-ws) — the WebSocket event catalogue.
* [Data Flow](/docs/learn/architecture/data-flow) — the end-to-end lifecycle of a live turn.
* [Reliability](/docs/principles/quality/reliability) — the broader resiliency patterns these principles sit inside.
* [Observability](/docs/principles/quality/observability) — how we trace across streaming connections.

## Anti-patterns we reject [#anti-patterns-we-reject]

* **Treating the socket as a fire-and-forget event bus.** No sequence numbers, no resumption, no idempotency. Works in demos; breaks in production.
* **Per-connection unbounded buffers.** A slow client should not kill the server's memory. Shed, coalesce, or block.
* **Reconnect-then-refetch-everything.** If the full state refresh is the recovery strategy, the protocol is broken. Use resumption tokens.
* **Ad-hoc event schemas.** Real-time events are contracts. They belong in AsyncAPI specs, versioned, generated.
* **Client-side reconciliation of echoes.** Echo suppression on the client is a fallback, not a design. Handle it at the gateway where the full context is available.

## Further reading [#further-reading]

* *Designing Data-Intensive Applications*, Martin Kleppmann — the chapters on streaming, ordering, and exactly-once semantics.
* *The Little Book of Semaphores*, Allen B. Downey — the synchronisation fundamentals that flow control builds on.
* *High Performance Browser Networking*, Ilya Grigorik — the chapters on WebSocket and real-time transports.
* The WebSocket RFC (RFC 6455) — worth reading at least once if you are going to build on top of it.


# Core API (/docs/reference/api/core)


# Core API Reference [#core-api-reference]

Our REST endpoints are governed by strict OpenAPI contracts. The complete technical reference—including schemas, authentication, and response payloads—is hosted in a dedicated, full-screen [Scalar](https://scalar.com) interactive viewer.

<br />

<a href="/api/core.html" target="_blank" rel="noreferrer">
  **Open Core API Reference ↗**
</a>


# ML API (/docs/reference/api/ml)


# ML API Reference [#ml-api-reference]

Our machine learning endpoints are governed by strict OpenAPI contracts. The complete technical reference—including voice processing payloads and transcription schemas—is hosted in a dedicated, full-screen [Scalar](https://scalar.com) interactive viewer.

<br />

<a href="/api/ml.html" target="_blank" rel="noreferrer">
  **Open ML API Reference ↗**
</a>


# Pub/Sub Events (/docs/reference/events/core-pubsub)


# Pub/Sub Events Schema [#pubsub-events-schema]

All internal, asynchronous service-to-service messaging is governed by strict AsyncAPI contracts using the **CloudEvents v1.0** envelope standard. For optimal readability and navigation, the complete catalog of Pub/Sub topics, messages, and payload structures is hosted in a dedicated, full-screen interactive viewer.

<br />

<a href="/api/core-pubsub-events.html" target="_blank" rel="noreferrer">
  **Open Pub/Sub Events Viewer in New Tab ↗**
</a>


# WebSocket Events (/docs/reference/events/core-ws)


# WebSocket Events Schema [#websocket-events-schema]

All client-facing, bidirectional WebSocket traffic is governed by strict AsyncAPI contracts using the **CloudEvents v1.0** envelope standard. For optimal readability and navigation, the complete catalog of WebSocket events, commands, and payload structures is hosted in a dedicated, full-screen interactive viewer.

<br />

<a href="/api/core-ws-events.html" target="_blank" rel="noreferrer">
  **Open WebSocket Events Viewer in New Tab ↗**
</a>


# Pitch (/docs/work/_template/pitch)


# \[Bet Title] [#bet-title]

> **Status**: Pitch
>
> **Author**: \[Name]
>
> **Date**: \[YYYY-MM-DD]

***

## Problem [#problem]

*What is the observed pain? Who experiences it, in what situation? Be specific — reference real user behaviour, not hypothetical users. Avoid jumping to the solution.*

*The practical consequences — what users can't do today, what workarounds they use, what signal is being lost.*

*Why the existing system doesn't solve it already.*

***

## Why Now [#why-now]

*What has changed that makes this the right time? Infrastructure, user behaviour, competitive pressure, accumulated cost? This section justifies why this problem rises to the top of the list now, not later.*

***

## Proposed Solution [#proposed-solution]

*The rough shape of the solution — fat marker, not fine-grained. Enough to agree on direction without removing meaningful implementation decisions. Diagrams encouraged. Implementation detail is not.*

***

## Rabbit Holes [#rabbit-holes]

*Approaches already considered and ruled out. These are plausible-looking solutions that would blow the appetite or the scope — name each one and explain why. Also include infrastructure assumptions the bet makes that could tempt someone into building a larger solution (e.g., "building a real-time backplane when sticky sessions suffice").*

* *Example: \[approach] — looks natural but would require \[expensive thing], which takes this out of appetite*

***

## No-Gos [#no-gos]

*Capabilities explicitly out of scope for this version. Include both obvious exclusions and natural extensions that users would reasonably expect but that don't belong in this bet — these are the most important no-gos because they're the ones most likely to creep in. Be specific about what's excluded and why.*

* *Example: \[obvious exclusion] — separate problem, different platform constraints*
* *Example: \[natural extension users would expect] — adds \[complexity] without justifying the appetite increase for v1*


# Problem Statement (/docs/work/_template/problem-statement)


# Problem Statement [#problem-statement]

> **Status**: Draft | Approved
>
> **Author**: *your name*
>
> **Date**: *YYYY-MM-DD*

## Problem [#problem]

*The specific friction being addressed. Who experiences it? What evidence do you have that it is real? One paragraph maximum.*

*The "who" can be internal — the engineering team, the system's reliability, the business's compliance posture. Platform and infrastructure gaps discovered during feature design are valid problem statements. Frame the pain in terms of what it blocks or what it costs, not just the technical gap.*

## Appetite [#appetite]

*How much time this problem is worth — given everything else you could be doing. State the time budget and the reasoning behind it, not just the number.*

> Appetite is not an estimate. It is an opportunity cost judgment made before the solution is defined.

## Why Now [#why-now]

*The reason this is worth solving in the next cycle rather than a future one.*


# Delivered Bets (/docs/work/delivered)


# Delivered Bets [#delivered-bets]

Once an active **Bet** is fully integrated, deployed, and validated against its target outcomes, it transitions here into the **Archive of Delivered Bets**.

These bets are considered feature-complete as defined by their original pitch boundaries. They remain accessible as historical references for architecture, product decisions, and testing implementations.


# Pitch (/docs/work/meeting-recording/pitch)


# Meeting Recording [#meeting-recording]

> **Status**: Active
>
> **Author**: Ryan Nel
>
> **Date**: 2026-04-26

***

## Problem [#problem]

Users of WordLoop cannot capture a meeting as it happens. Today, the only way a meeting enters the system is via audio file upload — a user records externally, exports the file, then imports it. There is no live capture flow.

The practical consequences are real:

1. **Upload friction** — a user must run a separate recording tool, remember to export the file, and then import it into WordLoop. The cognitive overhead is real enough that most users don't bother for shorter or informal conversations.
2. **No real-time feedback** — there is no visibility into what is being captured while a meeting is happening. Insights only appear after a full batch pipeline finishes.
3. **Lost signal** — short conversations that don't warrant a formal recording never enter the system at all, even when they contain decisions that matter.

The ML infrastructure to support live transcription is already operational — it is not wired to any live input path.

***

## Why Now [#why-now]

The ML service (AssemblyAI, speaker diarisation, task extraction) is proven through the file upload path. The hard AI work is already done, so this bet only needs to wire a live input path into existing infrastructure. Every cycle we wait, users form habits around workarounds. Without live capture, WordLoop is a post-hoc analysis tool. With it, WordLoop becomes something you open at the start of a meeting.

***

## Proposed Solution [#proposed-solution]

The solution introduces a live capture flow directly into the browser, feeding real-time audio through the existing ML pipeline via streams/WebSockets. User interaction has three primary surfaces:

### 1. The Entry Point [#1-the-entry-point]

The existing "New Meeting" button is expanded into a dropdown menu to offer a choice between uploading a file and starting a live recording.

<img alt="Start Recording Dropdown" src="__img0" />

### 2. The Active Recording View [#2-the-active-recording-view]

A focused, distraction-free workspace that serves as the primary interface during a live meeting. It provides real-time feedback that the system is capturing and understanding the conversation.

<img alt="Active Recording UI Layout" src="__img1" />

**Key Interactions:**

* **Live Notes:** Private scratchpad for the user.
* **Context Panel (Right):** Real-time topic summaries, running transcript, and captured action items.

### 3. The Meeting Summary (Post-Recording) [#3-the-meeting-summary-post-recording]

Once the meeting ends (Stop & Save), the user is taken to the standard Meeting Overview page. The design aligns with the existing Upload flow, but represents the final persistence of the live event.

**Overview Tab**
<img alt="Meeting Summary Overview Tab" src="__img2" />

**Transcript Tab**
<img alt="Meeting Summary Transcript Tab" src="__img3" />

***

## Rabbit Holes [#rabbit-holes]

**Building a server-side mixing/composition step during the live session.** The audio stays chunked on GCS until the session ends — the composition step only runs once, at stop time. Attempting to maintain a merged file during the session adds write contention and unnecessary complexity.

**Trying to preserve real-time transcript quality for the post-meeting view.** The live transcript is intentionally low-accuracy (streaming, for latency). Post-meeting re-processing replaces it entirely. Attempting to "patch" the live transcript rather than replace it would be complex and fragile.

**Speaker voice profile enrichment as a live operation.** Matching a voice during a session is necessary; enriching the enrolled profile with new embeddings is not. Profile updates happen during post-meeting processing when all segment embeddings are available. Doing this live adds latency and complicates the speaker matching hot path.

**Persisting OPFS data beyond the current session.** The OPFS buffer is a transport safety net, not a permanent store. It should be cleared as soon as Core confirms GCS receipt. Treating it as a backup or replay store is out of scope.

***

## No-Gos [#no-gos]

* **Recording from mobile browsers (this bet).** `MediaRecorder` with reliable WebM chunk output and OPFS `createSyncAccessHandle()` are desktop browser capabilities. Mobile browsers have weaker support for both. This bet builds for desktop (Chrome/Edge primary, Safari 17+ best-effort). However, mobile recording is an important future capability — the architecture should not make choices that preclude it. If local audio buffering (OPFS) is not viable on mobile, the system should degrade to a direct-stream-only mode without the safety net. This is scoped deliberately so that a future mobile bet can extend the existing infrastructure rather than rebuild it.
* **Multi-device recording for a single meeting.** One active recording session per meeting. Merging audio streams from multiple devices is not part of this bet.
* **Live collaboration on notes.** Notes are a private per-user scratchpad. Real-time multi-user editing (like Google Docs) is not part of this bet.
* **Exporting or downloading the recording.** The audio is stored in GCS and made available for playback via signed URL. Download/export as a feature is not part of this bet.
* **Custom vocabulary or transcription hints.** AssemblyAI's default transcription model is used as-is. Custom vocabulary tuning or domain-specific language models are out of scope.
* **Pause/resume during a live recording.** A recording runs continuously from start to stop. Pause introduces session-split complexity (multiple audio segments, transcript gap handling, timer semantics) that doesn't justify the cost for v1. Captured as a separate problem statement — users will want to "go off record" temporarily.
* **System audio capture (`getDisplayMedia`).** This bet captures the user's microphone only. Capturing system audio (e.g., a Zoom call playing through speakers) requires display media permissions and a different audio routing pipeline. Captured as a separate problem statement — this is the natural next capability after mic-only recording.


# Pitches (/docs/work/pitches)


# Pitches [#pitches]

A **Pitch** is the evolution of a Problem Statement. It means the problem has been validated as worth solving, and a rough boundary has been drawn around a potential solution.

We shape pitches by defining the problem, outlining a proposed solution, setting a fixed appetite (e.g., 6 weeks), identifying clear rabbit holes to avoid, and declaring definitive no-gos to prevent scope creep.

When a Pitch is fully shaped, it becomes a candidate for funding. A funded Pitch becomes an active **Bet**.


# Backplane (/docs/work/problem-statements/backplane)


# Backplane [#backplane]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

WebSocket events from Core can only reach the client connected to the pod that holds the session. There is no pub/sub backplane between Core pods. Any feature that uses WebSocket push — live recording, future notifications, live collaboration — is constrained to sticky-session load balancing. A pod restart drops all its sessions with no migration path. The engineering team discovered this constraint while designing the Meeting Recording bet, where it was documented as a design decision rather than solved inline.

## Appetite [#appetite]

3–4 weeks. The problem is well-understood (Redis Pub/Sub or NATS between pods), but the solution touches Core's connection management, deployment topology, and every feature that uses WebSocket push. The risk is not the backplane itself — it's the integration surface across existing features.

## Why Now [#why-now]

Meeting Recording is the first feature to rely heavily on WebSocket push. Sticky sessions work at current scale, but the constraint is now visible and documented. Solving it before a second real-time feature ships avoids retrofitting two features instead of one.


# Data Retention & Deletion (/docs/work/problem-statements/data-retention-and-deletion)


# Data Retention & Deletion [#data-retention--deletion]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

There is no cascade-delete path for a meeting. Deleting a meeting record in PostgreSQL leaves behind: GCS audio chunks, the composed audio file, voice embeddings used for speaker matching, and any LLM-generated artefacts (summaries, talking points). There are no lifecycle policies for temporary objects (OPFS buffers, pre-compose chunks beyond the 24h GCS TTL). There is no GDPR-compliant "right to erasure" flow that ensures all derived data is removed. The compliance posture of the platform degrades with every recording stored.

## Appetite [#appetite]

2–3 weeks. The scope is a delete cascade across PostgreSQL, GCS, and any embedding store, plus lifecycle policies for ephemeral objects. An audit trail for compliance verification may extend this.

## Why Now [#why-now]

Meeting recordings contain sensitive audio data — conversations, decisions, and potentially legally protected speech. The longer recordings accumulate without a deletion mechanism, the larger the compliance surface grows. This should be in place before live recording reaches production users.


# Problem Statements (/docs/work/problem-statements)


# Problem Statements [#problem-statements]

The first step in our delivery lifecycle is acknowledging a problem.

A **Problem Statement** is an articulation of friction, inefficiency, or an unmet need. It does not dictate a solution. It merely establishes that something requires attention and validates that the problem actually exists with objective evidence.

If a Problem Statement survives scrutiny and proves worth solving, it can be shaped into a **Pitch**.


# ML→Core Write-Back Resilience (/docs/work/problem-statements/ml-writeback-resilience)


# ML→Core Write-Back Resilience [#mlcore-write-back-resilience]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

After post-meeting processing completes, ML writes results to Core REST: transcript segments, synthesis artefacts (headline, summary, topics, talking points), transcription status transitions, and system-generated tasks. If Core is unavailable at delivery time, ML retries with exponential backoff. If Core remains down beyond the retry budget, results are lost — the user sees a permanently stuck "Processing…" state with no recovery path.

The same risk applies to live-session write-backs (transcript segment appends, talking points, tasks) if the Core REST endpoints become unreachable while the ML WebSocket remains connected. Today, ML has no durable store for pending write-backs and no mechanism to resume delivery after a prolonged Core outage.
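The retry behaviour described above can be made concrete with a small sketch. It is written in TypeScript for consistency with the other examples in these docs, although the ML service itself is Python. The `RetryPolicy` shape, the function names, and the example numbers are assumptions for illustration, not the values ML actually uses.

```typescript
// Hypothetical retry policy; the real budget used by wordloop-ml is not documented here.
interface RetryPolicy {
  baseDelayMs: number; // delay before the first retry
  factor: number;      // multiplier applied after each failed attempt
  maxDelayMs: number;  // cap on any single delay
  maxRetries: number;  // retries allowed before the result is given up on
}

// Delay before each retry, in order. Jitter is omitted to keep the sketch deterministic.
function backoffSchedule(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < policy.maxRetries; retry++) {
    const raw = policy.baseDelayMs * Math.pow(policy.factor, retry);
    delays.push(Math.min(raw, policy.maxDelayMs));
  }
  return delays;
}

// Total time spent waiting before the retry budget is exhausted.
// A Core outage longer than this loses the write-back unless it is stored durably.
function retryBudgetMs(policy: RetryPolicy): number {
  return backoffSchedule(policy).reduce((sum, delay) => sum + delay, 0);
}
```

With a base delay of 500 ms, a factor of 2, a 4 s cap, and 5 retries, the schedule is 500, 1000, 2000, 4000, 4000 ms, so the whole budget is 11.5 s. Any outage longer than that falls into the unrecoverable case described in this problem statement.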

## Appetite [#appetite]

2–3 weeks. The core problem is well-understood (durable outbox or Pub/Sub fallback), but the solution must handle idempotent delivery, ordering constraints (status transitions must arrive in order), and partial-failure scenarios (some artefacts delivered, others not).

## Why Now [#why-now]

The Meeting Recording bet is the first feature where ML produces durable artefacts that users expect to appear reliably. During live recording, the ML WebSocket provides a natural retry path — if the WebSocket is connected, Core is reachable. But post-meeting processing runs asynchronously via Pub/Sub, and the write-back window may not overlap with Core availability. The current retry-and-hope approach is acceptable for a first release, but the failure mode is user-visible and unrecoverable without manual intervention.


# Orphaned Meeting Cleanup (/docs/work/problem-statements/orphaned-meeting-cleanup)


# Orphaned Meeting Cleanup [#orphaned-meeting-cleanup]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

The live recording startup flow creates the meeting resource via REST before initiating the recording via WebSocket. If the recording startup fails — ML is unavailable, AssemblyAI is unreachable, or the browser closes during the handshake — the user is left with an empty meeting in their list: no recording, no transcript, no audio. There is no automatic cleanup, retry mechanism, or user-facing recovery path. The user must manually delete the orphaned meeting.

This is a known trade-off from the Meeting Recording bet. The meeting-first design was chosen because the meeting ID is required before any recording state can be created, and making the two operations atomic (single request) would couple REST and WebSocket lifecycles. For v1, the user is expected to clean up orphaned meetings themselves.

## Appetite [#appetite]

1–2 weeks. The solution space ranges from a simple client-side retry flow (try recording startup again on the same meeting) to server-side cleanup (garbage-collect meetings with no recording after a timeout). The right approach depends on how frequently this failure occurs in practice — instrumentation from v1 should inform the design.
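As a rough illustration of the server-side end of that range, the selection step of a garbage collector could look like the sketch below. It is not a proposed design. The `MeetingSummary` shape, the `hasRecording` flag, and the grace period are all hypothetical, and Core is a Go service, so TypeScript is used here only for consistency with the other examples in these docs.

```typescript
// Hypothetical projection of a meeting row; field names are assumptions.
interface MeetingSummary {
  id: string;
  createdAtMs: number;   // creation time as epoch milliseconds
  hasRecording: boolean; // true once any recording state exists for the meeting
}

// Returns the IDs of meetings that never received a recording
// and are older than the grace period.
function selectOrphanedMeetings(
  meetings: MeetingSummary[],
  nowMs: number,
  graceMs: number,
): string[] {
  return meetings
    .filter((m) => !m.hasRecording && nowMs - m.createdAtMs >= graceMs)
    .map((m) => m.id);
}
```

The grace period matters: a meeting that is seconds old may simply still be in its recording handshake, so a real implementation would need a timeout informed by the v1 instrumentation mentioned above.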

## Why Now [#why-now]

This problem was identified during the design review of the Meeting Recording bet and deferred deliberately. It should be revisited after the bet ships and real-world failure rates are observed.


# Pause & Resume (/docs/work/problem-statements/pause-resume)


# Pause & Resume [#pause--resume]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

A live recording runs continuously from start to stop. There is no way for a user to temporarily pause — to go "off record" for a sensitive aside, a personal conversation, or a break — without ending the entire session. Users who need a moment of privacy must choose between stopping the recording (losing the live session context) or continuing to record content they don't want captured. This is a natural expectation for any recording tool, and its absence will be felt immediately.

## Appetite [#appetite]

2–3 weeks. Pause introduces session-split complexity: multiple audio segments, transcript gap handling, timer semantics (does pause time count toward duration?), UI state transitions, and the OPFS buffer must handle segment boundaries cleanly. The design is well-scoped but touches the full audio pipeline.

## Why Now [#why-now]

Meeting Recording is shipping without pause/resume as a deliberate no-go — the complexity doesn't justify the cost for v1. But users will ask for this on day one. Capturing it now ensures the design accounts for it in sequencing rather than treating it as an afterthought.


# Recording Consent (/docs/work/problem-statements/recording-consent)


# Recording Consent [#recording-consent]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

Users can start a live recording with no consent mechanism for other meeting participants. In many jurisdictions (EU, California, Illinois), recording a conversation without all-party consent is illegal. WordLoop currently provides no disclosure UI, no consent collection, and no legal guidance. A user recording a Zoom call through their microphone may unknowingly violate recording consent laws. The risk falls on both the user and the platform.

## Appetite [#appetite]

1–2 weeks. The solution is likely a pre-recording disclosure screen with configurable text, not a full consent-collection platform. Legal review of the disclosure language is the long pole.

## Why Now [#why-now]

Live recording ships without this, and users will use it immediately. Legal risk is proportional to adoption — every recording made without a consent mechanism increases exposure. This should ship alongside or very shortly after the recording feature.


# Replay Buffer Optimization (/docs/work/problem-statements/replay-buffer-optimization)


# Replay Buffer Optimization [#replay-buffer-optimization]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

Core's WebSocket replay buffer retains the last 5 minutes of durable events per user. If the client reconnects after the buffer expires, Core sends a `ReplayExpiredEvent` and the client must do a full state re-fetch via REST. For a 2-hour meeting with thousands of transcript segments, hundreds of talking points, and dozens of tasks, this full re-fetch is expensive — both in latency and server load.

The current design uses a full refresh on replay expiry. This is acceptable for v1 because reconnects beyond 5 minutes are expected to be rare during active recording sessions (the OPFS buffer and automatic reconnect cover most short outages). However, as session lengths grow and the product adds more real-time state, the cost of full re-fetch will increase.
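The replay-or-expire behaviour described above can be modelled in a few lines. The sketch below is a simplified in-memory model, not Core's implementation (Core is written in Go). The per-user sequence number, the `ReplayBuffer` class, and the `replay_expired` result, which stands in for `ReplayExpiredEvent`, are assumptions made for illustration.

```typescript
// One durable event in a user's stream. The sequence number is an assumption.
interface DurableEvent {
  seq: number;         // increases by one per event
  emittedAtMs: number; // emission time as epoch milliseconds
  payload: string;
}

type ReplayResult =
  | { kind: "replay"; events: DurableEvent[] }
  | { kind: "replay_expired" }; // stands in for ReplayExpiredEvent

class ReplayBuffer {
  private readonly ttlMs: number;
  private events: DurableEvent[] = [];
  private nextSeq = 1;
  private evictedUpTo = 0; // highest sequence number already dropped

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  append(payload: string, nowMs: number): DurableEvent {
    this.evict(nowMs);
    const event: DurableEvent = { seq: this.nextSeq++, emittedAtMs: nowMs, payload };
    this.events.push(event);
    return event;
  }

  // Called on reconnect with the last sequence number the client saw.
  replaySince(lastSeenSeq: number, nowMs: number): ReplayResult {
    this.evict(nowMs);
    // If an event the client never saw has been evicted, replay would leave a gap.
    if (lastSeenSeq < this.evictedUpTo) return { kind: "replay_expired" };
    return { kind: "replay", events: this.events.filter((e) => e.seq > lastSeenSeq) };
  }

  // Drops events older than the TTL, oldest first.
  private evict(nowMs: number): void {
    let oldest = this.events[0];
    while (oldest !== undefined && nowMs - oldest.emittedAtMs > this.ttlMs) {
      this.evictedUpTo = oldest.seq;
      this.events.shift();
      oldest = this.events[0];
    }
  }
}
```

In this model a client that reconnects within the TTL receives only the events it missed, while a client whose missed events have aged out gets `replay_expired` and must fall back to the full REST re-fetch.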

## Appetite [#appetite]

1–2 weeks. Potential approaches include: extending the replay buffer TTL, including `last_known_version` per entity type in `ReplayExpiredEvent` to enable incremental fetches, or a cursor-based catch-up mechanism that fetches only events the client missed.

## Why Now [#why-now]

This problem was identified during the design review of the Meeting Recording bet and deferred deliberately. The full-refresh approach is sufficient for v1. It should be revisited if monitoring shows that replay-expired reconnects are common or that the full re-fetch causes measurable latency or load.


# Session Recovery (/docs/work/problem-statements/session-recovery)


# Session Recovery [#session-recovery]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

When a user accidentally closes the browser tab during a live recording, the session ends permanently. Post-meeting processing runs on whatever audio reached GCS up to that point. The OPFS shadow buffer — which captures every audio chunk locally — persists beyond tab close, but there is no mechanism to resume the session from it. The audio data is still on the device; the session state is not. Users who lose a tab to a browser crash, an accidental close, or a system restart lose their live session even though the audio was never lost.

## Appetite [#appetite]

2–3 weeks. The solution requires persisting session state (not just audio) to survive tab closure, a recovery detection flow on page load, and careful handling of the gap between where GCS left off and where OPFS continued. The OPFS infrastructure already exists — this extends it from audio-only resilience to full session resilience.

## Why Now [#why-now]

Meeting Recording documents this as a known design decision: "Session not resumable after tab close (v1)." The OPFS buffer already captures the audio — the gap is session state persistence. As recordings get longer and more valuable, accidental tab closure becomes a more painful data loss vector. This should follow shortly after the initial recording release.


# System Audio Capture (/docs/work/problem-statements/system-audio-capture)


# System Audio Capture [#system-audio-capture]

> **Status**: Draft
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Problem [#problem]

The Meeting Recording feature captures audio from the user's microphone only. For users in remote meetings — Zoom, Teams, Google Meet — the other participants' audio plays through speakers or headphones but is not captured by the microphone input. The user hears the full conversation; WordLoop hears only one side. This makes the transcription incomplete and the extracted insights unreliable for the most common meeting format: remote calls.

Capturing system audio requires `getDisplayMedia({ audio: true })`, which uses a fundamentally different browser permission model (display media, not user media), a different audio routing pipeline (system audio loopback vs. microphone input), and potentially mixing two audio sources (mic + system) into a single stream.

## Appetite [#appetite]

3–4 weeks. The browser APIs are well-documented but the permission UX differs across browsers and operating systems. The audio mixing pipeline (mic + system) needs careful handling to avoid echo and gain issues. This is the natural next capability after mic-only recording and the one users will expect most urgently.

## Why Now [#why-now]

Remote meetings are the dominant use case. Mic-only recording covers in-person conversations and the user's own voice, but the highest-value scenario — capturing a full remote meeting — requires system audio. This should be the next recording capability after the initial mic-only release.


# App Architecture Rules (/docs/learn/services/app/architecture)


# Architecture Rules: App Service (Next.js) [#architecture-rules-app-service-nextjs]

`wordloop-app` is a Next.js 16 App Router application. Although the framework dictates file-system routing, we still follow our [Service Architecture Principles](/docs/principles/system-design/hexagonal-architecture) by mapping Hexagonal concepts onto the React file tree. These boundaries are mandatory.

## 1. Core Tooling [#1-core-tooling]

* **Package Manager:** Use `pnpm` exclusively.
* **Component Testing:** Use Vitest and React Testing Library. All asynchronous state mutations must be wrapped in `act(...)` to prevent race conditions.
* **Icons:** Use `lucide-react` exclusively.
* **Tailwind v4:** No `tailwind.config.ts`. All configuration sits in `app/globals.css` using `@theme` blocks.

## 2. Inward Dependencies & Routing (The Ports & Adapters Mapping) [#2-inward-dependencies--routing-the-ports--adapters-mapping]

The codebase follows a strict inward-facing dependency graph.

* **Entrypoints (`app/`):** React Server Components (RSC) act as the primary inbound adapters, parsing URL parameters and streaming Server Actions.
* **UI & Features (`components/`):** Client and Server components. Must not house business logic or API access.
* **Service/Domain (`hooks/` & `lib/core/`):** Local state execution, custom hooks, and pure utilities.
* **Providers (`lib/providers/`):** Secondary adapters. Schemas (Zod) and API generated clients that reach out to our backend. Cannot import from `components/` or `app/`.

## 3. The Backend API Proxy [#3-the-backend-api-proxy]

To ensure environment-agnostic Docker deployments without build-time variable inlining:

* All Orval-generated Next.js clients fetch relative URLs (`/api/...`).
* A Next.js Route Handler interceptor (`app/api/[...proxy]/route.ts`) acts as a server-side reverse proxy.
* It proxies incoming requests to the URL specified by `CORE_API_URL` at runtime.
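The URL mapping at the heart of such a proxy can be sketched as a pure function. This is an illustration under stated assumptions, not the actual contents of `route.ts`: `buildUpstreamUrl` is a hypothetical helper, `coreApiUrl` stands in for the runtime value of `CORE_API_URL`, and the path is assumed to be forwarded unchanged (the real handler may strip or rewrite the `/api` prefix).

```typescript
// Maps an incoming proxy request URL onto the Core base URL.
// Assumption: path and query string are forwarded as-is.
function buildUpstreamUrl(incomingUrl: string, coreApiUrl: string): string {
  const incoming = new URL(incomingUrl);
  // Resolving an absolute path against the base keeps the base's scheme, host,
  // and port, and replaces any path the base itself carried.
  const upstream = new URL(incoming.pathname + incoming.search, coreApiUrl);
  return upstream.toString();
}
```

Because the base URL is passed in at call time rather than inlined at build time, the same image can point at a different Core instance in each environment, which is the property this section is after.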

## 4. Authentication (Clerk) [#4-authentication-clerk]

* **Identity:** We utilize `@clerk/nextjs` for all interactive flows.
* **Token passing:** The Orval fetcher instance intercepts requests to `/api/...` and injects the Clerk JWT inside the `Authorization` header.

## 5. State & Data Validation [#5-state--data-validation]

* **Server Actions:** All mutations use React Server Actions returning the Result pattern (`{ data: T, error: string | null }`).
* **Zod Schemas:** API contracts dictate the interface. Do not write handwritten TypeScript definitions. Always `z.infer` from the generated Zod schemas.
* **Exhaustive Matching:** `switch` statements must terminate with an exhaustive `default: never` check.
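A minimal sketch of the exhaustive-matching rule follows. The `TranscriptionStatus` union and its labels are hypothetical and written by hand only to keep the example self-contained; in the app, such types would be inferred from the generated Zod schemas. `assertNever` is a common TypeScript idiom, not a helper this codebase is known to export.

```typescript
// Hypothetical union for illustration only.
type TranscriptionStatus = "pending" | "processing" | "completed" | "failed";

// Accepts only `never`, so the call below fails to compile
// as soon as a new union member is left unhandled.
function assertNever(value: never): never {
  throw new Error(`Unhandled case: ${String(value)}`);
}

function statusLabel(status: TranscriptionStatus): string {
  switch (status) {
    case "pending":
      return "Queued";
    case "processing":
      return "Processing…";
    case "completed":
      return "Ready";
    case "failed":
      return "Failed";
    default:
      // `status` is narrowed to `never` here when every case is covered.
      return assertNever(status);
  }
}
```

Adding a fifth member to the union without a matching `case` turns the `default` branch into a compile error, which is the point of the rule: missing cases surface at build time rather than at runtime.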


# Design Guide (/docs/learn/services/app/design-guide)


# Visual Design & CSS Execution Guide [#visual-design--css-execution-guide]

The visual language of this application merges the high-density, mathematical rigor of a Linear-inspired SaaS interface with a "Liquid Glass" physical aesthetic. It relies on a strictly constrained monochrome foundation (Midnight/Obsidian), highly intentional accent colours, and depth created through translucency, backdrop blurs, and multi-layered shadows.

## 1. Colour Architecture: The OKLCH System [#1-colour-architecture-the-oklch-system]

All colours must be defined and manipulated using the **OKLCH** colour space to ensure perceptual uniformity and accessible contrast scaling. Hexadecimal, RGB, and HSL are strictly forbidden. To allow for dynamic alpha transparency injection in CSS and Tailwind v4, colours are defined by their structural components within the `@theme` directive.

### The Midnight / Obsidian Palette [#the-midnight--obsidian-palette]

This dark-mode default minimises ambient chrome, relying on absolute darks and stark contrasts.

| Element          | CSS Variable Definition  | OKLCH Value     | Application Context                             |
| :--------------- | :----------------------- | :-------------- | :---------------------------------------------- |
| **Canvas**       | `--color-bg-primary`     | `0.15 0.01 285` | The absolute lowest z-index layer.              |
| **Surface**      | `--color-bg-secondary`   | `0.18 0.01 285` | Bento box cards, sidebars, elevated containers. |
| **Primary Text** | `--color-text-primary`   | `0.95 0.01 285` | Headings, active tabs, high-emphasis body text. |
| **Muted Text**   | `--color-text-secondary` | `0.45 0.02 285` | Metadata, timestamps, inactive UI labels.       |
| **Accent**       | `--color-accent`         | `0.70 0.14 260` | Primary buttons, active states, focus rings.    |
| **Success**      | `--color-success`        | `0.70 0.10 140` | Soft Sage Green for positive feedback.          |
| **Error**        | `--color-error`          | `0.60 0.12 20`  | Muted Rose for sophisticated error states.      |

### Dynamic Opacity Injection [#dynamic-opacity-injection]

To use these variables with opacity in Tailwind v4, use CSS relative colour syntax (the `from` keyword):

```css
.card-glass {
  background-color: oklch(from var(--color-bg-secondary) l c h / 0.8);
}
```

## 2. Typographic Scaling (Geist) [#2-typographic-scaling-geist]

Typography utilises the **Geist** font stack. It must maintain microscopic legibility for data density while offering geometric elegance for large marketing surfaces. All sizing is executed in `rem` units to respect browser accessibility baselines (`1rem = 16px`).

Tracking (letter-spacing) is mathematically tied to size: as size increases, tracking decreases to maintain optical tightness.

| Typographic Level  | Size (rem / px)    | Weight | Line Height | Tracking   | Use Case                             |
| :----------------- | :----------------- | :----- | :---------- | :--------- | :----------------------------------- |
| **Display Header** | `3.5rem` (56px)    | `600`  | `1.0`       | `-0.025em` | Hero sections, main view titles.     |
| **Section Header** | `2.0rem` (32px)    | `600`  | `1.2`       | `-0.025em` | Modal headers, bento card titles.    |
| **Body Primary**   | `1.0rem` (16px)    | `400`  | `1.5`       | `-0.011em` | Standard paragraphs, descriptions.   |
| **UI Control**     | `0.875rem` (14px)  | `500`  | `1.2`       | `-0.011em` | Buttons, navigation links, tabs.     |
| **Micro Data**     | `0.6875rem` (11px) | `500`  | `1.2`       | `0.05em`   | Metadata, badges (always uppercase). |

## 3. Spatial Architecture: The 8-Point Grid [#3-spatial-architecture-the-8-point-grid]

All margins, paddings, gap spacings, and layout dimensions adhere to an 8-point base grid. This prevents sub-pixel rendering blur on high-density displays.

| Scale Step       | Pixel Equivalent | REM Equivalent | CSS Variable Token |
| :--------------- | :--------------- | :------------- | :----------------- |
| **Sub-Grid**     | 4px              | `0.25rem`      | `--space-1`        |
| **Base Step**    | 8px              | `0.5rem`       | `--space-2`        |
| **Step 1.5**     | 12px             | `0.75rem`      | `--space-3`        |
| **1x Spacing**   | 16px             | `1.0rem`       | `--space-4`        |
| **1.5x Spacing** | 24px             | `1.5rem`       | `--space-6`        |
| **2x Spacing**   | 32px             | `2.0rem`       | `--space-8`        |

## 4. Layout Paradigm: The Bento Box Grid [#4-layout-paradigm-the-bento-box-grid]

The structural arrangement of components relies on heavily compartmentalised cards that adhere to a foundational 12-column CSS Grid. Visual hierarchy is established by the surface area a card occupies.

```css
.bento-grid {
  display: grid;
  gap: var(--space-4); /* 16px */
  grid-template-columns: repeat(12, minmax(0, 1fr));
  grid-auto-rows: 90px; /* Establishes vertical rhythm */
}

/* A high-priority card visually dominating the layout */
.card-primary {
  grid-column: span 8;
  grid-row: span 4;
}
```

## 5. Surface Dynamics: Borders, Radii, and Elevation [#5-surface-dynamics-borders-radii-and-elevation]

Because the colour palette is heavily constrained, the UI relies on optical illusions of depth to separate cards from the background canvas.

### The 1px Subtle Border Illusion [#the-1px-subtle-border-illusion]

In dark mode environments, standard borders appear harsh. Define edges using heavily transparent white strokes to simulate a microscopic physical bevel.

```css
.surface-border {
  border: 1px solid oklch(1 0 0 / 0.08); /* Pure white, 8% opacity */
}
```

### Concentric Corner Radii [#concentric-corner-radii]

Excessive rounding feels consumer-centric; tighter rounding conveys engineering precision.

* Outer cards use `8px` (`0.5rem` / `--space-2`) radii.
* Internal nested elements scale down to `4px` (`0.25rem` / `--space-1`).
* *Mathematical Rule:* Inner Radius = Outer Radius - Padding.
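The radius rule can be checked with a one-line calculation. The helper below is a hypothetical illustration, and clamping at zero is an assumption added so that large paddings yield square corners instead of a negative radius.

```typescript
// Inner Radius = Outer Radius - Padding, clamped at 0 (the clamp is an assumption).
function innerRadiusPx(outerRadiusPx: number, paddingPx: number): number {
  return Math.max(0, outerRadiusPx - paddingPx);
}
```

For an `8px` outer card with `4px` (`--space-1`) of padding, this gives the `4px` inner radius listed above.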

### Multi-Layered Drop Shadows & Inset Lighting [#multi-layered-drop-shadows--inset-lighting]

Single-layer shadows fail to mimic real-world lighting. Use a 4-layer stack that runs from ambient occlusion to diffuse scatter, paired with an inset rim light on the top edge.

```css
.card-elevated {
  box-shadow: 
    0 2px 4px oklch(0 0 0 / 0.4),    /* Ambient Occlusion */
    0 4px 8px oklch(0 0 0 / 0.3),    /* Direct Shadow */
    0 8px 16px oklch(0 0 0 / 0.2),   /* Soft Penumbra */
    0 16px 32px oklch(0 0 0 / 0.1),  /* Diffuse Scatter */
    inset 0 1px 0 oklch(1 0 0 / 0.05); /* Top Rim Light */
}
```

## 6. Ambient Lighting & Mesh Textures [#6-ambient-lighting--mesh-textures]

To break the starkness of large bento cards, employ "Ambient Mesh Textures." These are subtle washes of colour that draw the eye without overwhelming the monochrome palette.

* **No Hard Edges:** Gradients must fade to `transparent` using percentages (`50%` to `70%`) to prevent visual banding.
* **Low Chroma:** Injected colours must maintain a low saturation in the OKLCH space.
* **Layering:** The base surface colour (`--color-bg-secondary`) must always be declared last as the solid fallback layer.

## 7. Glassmorphism & Neon Glow Effects [#7-glassmorphism--neon-glow-effects]

For global chrome elements (like the Command Palette or sticky headers) and active state highlights.

```css
/* Glassmorphism Surface */
.glass-panel {
  background: oklch(from var(--color-bg-secondary) l c h / 0.65);
  backdrop-filter: blur(12px);
  border-bottom: 1px solid oklch(1 0 0 / 0.08);
}

/* Neon Glow pseudo-element placed strictly behind active components */
.neon-glow::before {
  content: "";
  position: absolute;
  inset: 0;
  background: linear-gradient(to right, oklch(0.6 0.25 290), oklch(0.65 0.25 330));
  filter: blur(20px);
  opacity: 0.5;
  z-index: -1;
}
```

## 8. Implementation & Class Usage Hierarchy [#8-implementation--class-usage-hierarchy]

The application relies on specific utility classes defined in `globals.css`. Understanding *when* and *where* to apply these classes ensures optical consistency and preserves the Z-index hierarchy.

### Core Surfaces (The Z-Index Hierarchy) [#core-surfaces-the-z-index-hierarchy]

You have two distinct levels of "glass." Do not mix them arbitrarily.

* **`.glass-surface`**: The architectural workhorse.
  * **When to use:** Use this for standard bento box cards, sidebars, or any static container resting directly on the main canvas (`bg-background`).
  * **Visuals:** It features a tighter blur and a shallower ambient shadow stack.
* **`.glass-elevated`**: The overlay.
  * **When to use:** Reserve this strictly for floating elements that detach from the primary layout. This includes the **Command Palette**, dropdown menus, tooltips, and modal dialogs.
  * **Visuals:** It features a heavy 20px blur and a deeper, more diffuse shadow stack to physically "lift" it closer to the user.

### Ambient Texture Application (The "Glow" Cards) [#ambient-texture-application-the-glow-cards]

These classes override the standard `.glass-surface` to inject radial gradients. They should be used sparingly to create visual anchors and guide the user's eye.

* **`.card-insight` (The Hero Glow)**
  * **When to use:** Apply this *only* to the primary "Hero" card of a view (e.g., the "Daily Insight" or main overview brief).
  * **Visuals:** Injects a dual-tone purple and magenta ambient glow to create a premium, atmospheric anchor.
* **`.card-accent-glow` (The Action Glow)**
  * **When to use:** Use this for cards that require immediate primary user action, or to highlight a singular, highly important metric.
  * **Visuals:** Projects a subtle wash of the Cobalt accent colour from the top-center.
* **`.card-success-glow` (The Positive State)**
  * **When to use:** Apply dynamically when a bento card achieves a "completed" state (e.g., inbox zero, all tasks done).
  * **Visuals:** Projects a soft, organic Sage Green wash to calm the interface and signal positive feedback without relying on harsh alerts.


# App Implementation Guide (/docs/learn/services/app/implementation)


# App Implementation Guide (Next.js) [#app-implementation-guide-nextjs]

This guide translates WordLoop's overarching Engineering Principles into explicit, copy-pasteable React/Next.js code for the `wordloop-app` service.

## 1. Concrete Trace-First Development [#1-concrete-trace-first-development]

Next.js automatically instruments App Router requests with OpenTelemetry. However, when we perform background mutations or explicit fetch requests to the backend proxy, we dynamically enrich the trace.

### Identity Propagation (Baggage) [#identity-propagation-baggage]

The frontend is responsible for injecting the authenticated user's ID into the W3C Baggage header. This guarantees that all downstream services (Core, ML) can attribute their database queries directly back to the user without fetching identity twice.

```typescript
// lib/providers/fetcher.ts
import { auth } from '@clerk/nextjs/server';

export const customFetch = async (url: string, options: RequestInit = {}) => {
  const { userId, getToken } = await auth();
  const token = await getToken();

  const headers = new Headers(options.headers);
  if (token) headers.set('Authorization', `Bearer ${token}`);
  // W3C Baggage values must be percent-encoded.
  if (userId) headers.set('Baggage', `enduser.id=${encodeURIComponent(userId)}`);

  return fetch(url, { ...options, headers });
};
```

## 2. Concrete Error Handling (Server Actions) [#2-concrete-error-handling-server-actions]

We never throw naked exceptions from Server Actions to Client Components, as this causes hard React Error Boundary crashes. We utilize the **Result Pattern** to treat errors as standard data.

### The Result Pattern [#the-result-pattern]

Server Actions return an explicitly typed object containing either the data or the error message, forcing the frontend component to handle failure states gracefully.

```typescript
// app/actions/meetings.ts
'use server'

import { getMeetingClient } from '@/lib/providers/api';

export type ActionState<T> = 
  | { success: true; data: T }
  | { success: false; error: string };

export async function createMeetingAction(formData: FormData): Promise<ActionState<string>> {
  try {
    const title = formData.get('title') as string;
    const client = await getMeetingClient();
    
    // Attempt the mutation via generated P&A Provider
    const result = await client.createMeeting({ title });
    
    return { success: true, data: result.meetingId };
  } catch (error) {
    // Map network/backend errors to a graceful UI message
    return { success: false, error: "Failed to create meeting. Please try again." };
  }
}
```

### Graceful Component Degradation [#graceful-component-degradation]

The React component consumes this pattern directly without needing `try/catch` blocks.

```tsx
// components/submit-button.tsx
'use client'

import { createMeetingAction } from '@/app/actions/meetings';
import { useState } from 'react';

export function SubmitMeeting() {
  const [error, setError] = useState<string | null>(null);

  const handleSubmit = async (formData: FormData) => {
    const result = await createMeetingAction(formData);
    if (!result.success) {
      setError(result.error);
      return;
    }
    // Handle success (e.g., router.push)
  };

  return (
    <form action={handleSubmit}>
      {error && <div className="text-red-500">{error}</div>}
      <button type="submit">Create</button>
    </form>
  );
}
```

## 3. Concrete Dependency Injection (Providers) [#3-concrete-dependency-injection-providers]

Rather than handwriting brittle `fetch` calls scattered across multiple UI components, we rely entirely on generated API clients.

### Using the Generated Orval Client [#using-the-generated-orval-client]

`Orval` reads our OpenAPI spec and generates pure TypeScript hooks and fetchers. These act as our "Providers" in the Clean Architecture context. Components (the Domain) use them without caring about the underlying HTTP mechanism.

```tsx
// components/meeting-list.tsx
'use client'

// 1. Import the generated Provider
import { useGetMeetings } from '@/lib/providers/generated/wordloop';

export function MeetingList() {
  // 2. The Provider abstracts SWR caching, headers, and type validation
  const { data, error, isLoading } = useGetMeetings();

  if (isLoading) return <Skeleton />
  if (error) return <ErrorMessage error={error} />
  
  // 3. Types are generated from Core's OpenAPI spec, so they match the API contract
  return (
    <ul>
      {data?.meetings.map(m => <li key={m.id}>{m.title}</li>)}
    </ul>
  )
}
```

## 4. Idiomatic React & TypeScript Standards [#4-idiomatic-react--typescript-standards]

We do not aim to rewrite foundational guidance on writing excellent React and TypeScript code. Instead, we adhere to established industry baselines mapped to our internal engineering principles.

We expect all Wordloop App engineers to intimately understand:

* [Next.js App Router Documentation](https://nextjs.org/docs/app) for framework-dictated rendering boundaries.
* [Total TypeScript Patterns](https://www.totaltypescript.com/) for strict TypeScript fundamentals.

Below is concrete guidance on how overarching TS/React idioms manifest as system-enforced architectural invariants.

### Default to Server Components (Clean Architecture) [#default-to-server-components-clean-architecture]

**The React Idiom:** Start with React Server Components (RSC) and only use `'use client'` at the absolute leaf nodes.\
**The Principle Connection:** As defined in our [Service Architecture](/docs/principles/system-design/hexagonal-architecture), RSCs act as our "Inbound Adapters." They handle pure data fetching securely on the backend without exposing network waterfalls to the client. This enforces a strict separation where UI interactivity (Client components) is totally decoupled from data orchestration.

```tsx
// app/meetings/page.tsx
// This is a Server Component by default. No 'use client' directive.

import { getMeetingClient } from '@/lib/providers/api';
import { MeetingList } from '@/components/meeting-list';

export default async function Page() {
  // 1. Data orchestration stays securely on the server.
  const client = await getMeetingClient();
  const meetings = await client.listMeetings();
  
  // 2. We pass pure data down to the interactivity leaf.
  return (
    <main>
      <h1>Your Meetings</h1>
      <MeetingList initialData={meetings} />
    </main>
  );
}
```

### Discriminated Unions for Predictable State (Resilience) [#discriminated-unions-for-predictable-state-resilience]

**The TypeScript Idiom:** Using strict discriminated union types instead of optional properties.\
**The Principle Connection:** We avoid `try/catch` UI crashes by mapping server actions to unified result patterns. Using discriminated unions guarantees the TypeScript compiler will force the frontend engineer to handle both states explicitly, leading to [Resilient Error Handling](/docs/principles/quality/reliability).

```typescript
// 1. The Discriminated Union explicitly separates the Success and Failure states.
export type ActionState<T> = 
  | { success: true; data: T }
  | { success: false; error: string };

// 2. The UI is forced to check the discriminator before accessing data.
function handleResponse(response: ActionState<Meeting>) {
  if (!response.success) {
    // TS knows 'response' only has an 'error' here.
    showToast(response.error);
    return;
  }
  
  // TS knows 'response' guaranteed has 'data' here.
  renderMeeting(response.data);
}
```


# App Service (Next.js) (/docs/learn/services/app)


# App Service (Next.js) [#app-service-nextjs]

`wordloop-app` is the frontend UI. Built on the **Next.js 16 App Router**, the application prioritizes React Server Components (RSC) to render HTML on the server for the initial load, falling back to Client Components only for user interactivity.

## Architecture & Layout [#architecture--layout]

> \[!IMPORTANT]\
> The project enforces an absolute inward-facing dependency graph. Application routing logic can depend on internal business functions and hooks, but core business functions must never depend on UI primitives.

```text
services/wordloop-app/
├── app/                  # Next.js App Router (Pages, Layouts, API Proxies)
├── components/           # UI Elements
│   ├── ui/               # Shadcn/Radix Primitives
│   └── <domain>/         # Feature-specific components
├── hooks/                # SWR hooks for client caching
├── lib/                  # Core Logic (NO UI DEPENDENCIES ALLOWED)
│   ├── schemas/          # Zod schema definitions
│   ├── api.ts            # Generated HTTP client
│   └── utils.ts          # Pure logic and Tailwind mergers
├── orval.config.ts       # Code generation rules
└── globals.css           # Tailwind v4 configuration
```

## Local Development Workflow [#local-development-workflow]

1. **Start System Infrastructure**
   ```bash
   ./dev start infra core ml
   ```
   *(Boots databases, memory layer, and all backend services)*

2. **Start Next.js**
   ```bash
   cd services/wordloop-app
   pnpm run dev
   ```
   Visit [http://localhost:4001](http://localhost:4001) in your browser.

## Development Guidelines [#development-guidelines]

* **Tailwind v4 First:** Token declaration is strictly CSS-first. Please review our [Design Guide](design-guide.mdx) for UI patterns.
* **Strict Boundary Checks:** Always review the rigid [Frontend Architecture Rules](architecture.mdx) before abstracting components or lifting state.


# UX Guide (/docs/learn/services/app/ux-guide)


# UX Design Guide: The Velocity Manifesto [#ux-design-guide-the-velocity-manifesto]

The interface is a transparent, frictionless layer between the user’s thought and their data. The application must feel weightless, preemptive, and immediately responsive. Every interaction is designed to keep the user in a state of uninterrupted flow.

## 1. Core Interaction Pillars [#1-core-interaction-pillars]

The fundamental rule of this application's UX is **velocity**. The UI must never ask permission to be useful or force the user to manage the system.

* **Flow-State Entry (Zero-Click):** Upon rendering any page or view, the primary input must be immediately focused. The user should be able to begin typing the millisecond the application loads without reaching for a mouse or trackpad.
* **Frictionless Inline Editing:** The distinction between "view mode" and "edit mode" is eliminated. Administrative tasks and content creation are the same action. Clicking any text element transforms it into an active input field in-place. Avoid dedicated edit screens or modals.
* **Pre-emptive Architecture:** The application must anticipate the user's next action. Upon saving or submitting an entry, the UI must reset to a "Ready" state instantly.
* **Optimistic UI:** Never make the user wait for a server response. Use optimistic updates to reflect changes in the UI instantaneously, handling data synchronization silently in the background.

## 2. Navigation & The Command Palette [#2-navigation--the-command-palette]

The primary interface should remain sparse, dedicating maximum screen real estate to the user's content. Global navigation and complex actions are abstracted away from static menus.

* **The Central "Brain":** The Command Palette is the operational core of the application.
* **Keyboard-First Dominance:** Users must be able to navigate between loops, trigger global actions, and modify settings entirely via keyboard shortcuts utilizing the Command Palette.
* **Context Switching:** The Command Palette allows users to instantly pivot between distinct tasks without losing their place in the primary interface.

## 3. User Perception & "Liquid Glass" [#3-user-perception--liquid-glass]

The mental model for the interface is a continuous, physical sheet of glass that has been layered, etched, or frosted.

* **Depth over Distance:** Visual hierarchy is communicated through translucency and backdrop blurring, not aggressive drop shadows.
* **The Theme Split:**
  * *Milk (Light Mode):* Designed to feel luminous and airy, simulating natural light passing through frosted acrylic.
  * *Obsidian (Dark Mode):* Designed to feel deep and ink-like, utilizing high-contrast text against soft, dark depth to focus user attention.
* **Spatial Rhythm:** Proximity dictates relationship. Maintain a strict, consistent grid for macro-spacing, but utilize aggressively tighter groupings for related internal elements to build immediate visual associations.

## 4. Interaction, Motion, & State Communication [#4-interaction-motion--state-communication]

Movement and state changes must feel organic, physical, and calm.

* **Liquid Transitions:** Elements do not snap instantly between states. Use organic easing curves so elements flow smoothly from one layout or state to the next.
* **Micro-Friction:** Interactive elements should respond to the user's presence. Use subtle scale shifts on hover or press to make buttons and cards feel malleable and tactile.
* **Dynamic Stacking Context:** When overlays (like the Command Palette) are invoked, the background must dynamically blur further. This visually "pushes" the main content deep into the background, reinforcing the physical layering of the glass interface and narrowing the user's focus.
* **Calm Feedback Loops:**
  * *Success:* Use soft, organic Sage Green. It signals completion calmly, avoiding high-tension, vibrating colors.
  * *Error:* Use a muted Rose. It provides a clear, sophisticated warning signal that remains integrated with the soft glass aesthetic without resorting to harsh "stoplight" reds.
* **Progressive Disclosure via Iconography:** Icons function strictly as wayfinders. To keep the interface sparse, action icons should remain hidden or low-opacity until the user hovers over the parent container, revealing functionality exactly when needed.


# Core Architecture Rules (/docs/learn/services/core/architecture)


# Architecture Rules: Core Service (Go) [#architecture-rules-core-service-go]

The `wordloop-core` service is the direct physical manifestation of our [Service Architecture Principles](/docs/principles/system-design/hexagonal-architecture). It strictly abides by Clean Architecture (Ports and Adapters) to protect business rules from infrastructure volatility.

All domain logic resides in `internal/`; external dependencies point inwards.

## 1. Layers & Dependency Flow [#1-layers--dependency-flow]

* **Domain (`internal/core/domain`):** Pure Go. Zero dependencies. Defines entities, validation rules, and sentinel errors.
* **Gateways (`internal/core/gateway`):** Interfaces (Contracts) defining external data access. Depends only on Domain.
* **Services (`internal/core/service`):** Business logic and orchestration. Depends on Domain and Gateways.
* **Providers (`internal/provider`):** Concrete implementations (Postgres, PubSub). Depends on Domain and Gateways.
* **Entrypoints (`internal/entrypoints`):** HTTP routes, JWT middleware, OpenAPI mappings. Depends on Domain and Services.

## 2. Context & Dependency Injection [#2-context--dependency-injection]

* **Context is King:** `context.Context` MUST be the first parameter of every boundary function. This is critical for OpenTelemetry trace propagation.
* **Constructor Injection:** Use `NewService(...)` functions returning concrete types while accepting interfaces. Avoid global singleton state to ensure code remains deterministic and testable.

## 3. Telemetry & Observation [#3-telemetry--observation]

* **OpenTelemetry:** Every HTTP endpoint and background job must initialize a Root Span. Provider calls (e.g., executing a SQL query or publishing to Pub/Sub) must extract and cascade the span.
* **Logging:** Use `slog` exclusively. Always inject `trace_id` and `span_id` dynamically from the current context.

## 4. Authentication & Security [#4-authentication--security]

* **Clerk Identity:** Entrypoints must use the validated Clerk JWT middleware. Do not trust user IDs from requests; extract them from the authenticated context token.
* **Testing:** Provider layer testing requires actual Postgres containers (`testcontainers`). We prioritize absolute database fidelity; thus, we interact directly with real instances rather than mocking external state.

## 5. Migrations [#5-migrations]

* All database schema updates must be written as discrete `.sql` migration files in `scripts/migrations/`.
* State migrations are immutable. Rely solely on the programmatic runner (`./dev db migrate`).


# Core Implementation Guide (/docs/learn/services/core/implementation)


# Core Implementation Guide (Go) [#core-implementation-guide-go]

This guide translates WordLoop's overarching Engineering Principles into explicit, copy-pasteable Go code for the `wordloop-core` service.

## 1. Concrete Trace-First Development [#1-concrete-trace-first-development]

We rely on OpenTelemetry for all observability. Every inbound request starts a trace, and every outbound request cascades it.

### Initializing a Span [#initializing-a-span]

A new operation must start a span. If extracting from an HTTP Gin context, pass `c.Request.Context()`.

```go
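import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

// The tracer is injected at construction; the field is shown for completeness.
type TranscriptionService struct {
	tracer trace.Tracer
}

func (s *TranscriptionService) Process(ctx context.Context, meetingID string) error {
	// 1. Start the span
	ctx, span := s.tracer.Start(ctx, "TranscriptionService.Process")

	// 2. Guarantee it closes
	defer span.End()

	// 3. Enrich the span with concrete, searchable attributes
	span.SetAttributes(attribute.String("meeting.id", meetingID))

	// ... logic (pass ctx to every downstream call)
	return nil
}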
```

### Passing Context [#passing-context]

**Context is King.** Do not store context in structs. Pass it as the first parameter to every single Domain, Service, and Provider function. If you drop the context, you sever the distributed trace.

## 2. Concrete Error Handling [#2-concrete-error-handling]

We use Go's `errors.Is` together with "Sentinel Errors" defined in the Domain layer to prevent database or HTTP details from leaking into our Domain logic.

### Defining Sentinels [#defining-sentinels]

Define business rule errors in `internal/core/domain/errors.go`:

```go
package domain
import "errors"

var ErrMeetingNotFound = errors.New("meeting not found")
var ErrUnauthorized = errors.New("unauthorized access")
```

### Wrapping & Mapping Errors in Providers [#wrapping--mapping-errors-in-providers]

An Adapter (Provider) catching a third-party or infrastructure error must wrap it into a Domain error before returning it to the Service.

```go
package provider

import (
	"context"
	"database/sql"
	"errors"
	"fmt"

	"wordloop-core/internal/core/domain"
)

func (r *PostgresMeetingStore) GetMeeting(ctx context.Context, id string) (*domain.Meeting, error) {
	var meeting domain.Meeting
	err := r.db.QueryRowContext(ctx, "SELECT ...").Scan(...)
	
	if err != nil {
		if errors.Is(err, sql.ErrNoRows) {
			// Map infrastructure error to Domain concept
			return nil, fmt.Errorf("provider execution failed: %w", domain.ErrMeetingNotFound)
		}
		return nil, fmt.Errorf("unexpected db error: %v", err)
	}
	return &meeting, nil
}
```

## 3. Concrete Dependency Injection [#3-concrete-dependency-injection]

We use interface injection (Ports) to satisfy dependencies.

### The Port (Defined by the Core) [#the-port-defined-by-the-core]

The interface belongs in `internal/core/gateway/` or `internal/core/service/` and is strictly defined using Domain language.

```go
package gateway

import (
	"context"

	"wordloop-core/internal/core/domain"
)

type MeetingStore interface {
	GetMeeting(ctx context.Context, id string) (*domain.Meeting, error)
}
```

### The Wiring (Entrypoint) [#the-wiring-entrypoint]

Constructor injection is used to assemble the pieces at startup without relying on globals.

```go
// 1. Initialize the concrete Provider
dbProvider := provider.NewPostgresMeetingStore(sqlDB)

// 2. Inject it into the Service (which only knows the Gateway Interface)
meetingService := service.NewMeetingService(dbProvider)

// 3. Inject the Service into the inbound HTTP route
entrypoints.RegisterMeetingRoutes(router, meetingService)
```

## 4. Idiomatic Go & Standards [#4-idiomatic-go--standards]

We do not aim to rewrite foundational guidance on writing excellent Go code. Instead, we adhere to established industry baselines and strictly map them to our internal engineering principles.

We expect all Wordloop Core engineers to intimately understand:

* [Effective Go](https://go.dev/doc/effective_go) for language fundamentals.
* [Uber Go Style Guide](https://github.com/uber-go/guide/blob/master/style.md) for practical, enterprise-grade formatting, concurrency, and pattern consensus.

Below is concrete guidance on how overarching Go idioms manifest as system-enforced architectural invariants.

### Accept Interfaces, Return Structs (Clean Architecture) [#accept-interfaces-return-structs-clean-architecture]

**The Go Idiom:** "Accept interfaces, return structs."\
**The Principle Connection:** This idiom is the bedrock of [Clean Architecture (Ports and Adapters)](/docs/principles/system-design/hexagonal-architecture). Gateways (Ports) define the interfaces. Services accept those interfaces. Providers return concrete struct representations.

```go
// 1. The Gateway (Port) is an interface
type Store interface {
	Get(ctx context.Context, id string) (*domain.Meeting, error)
}

// 2. The Service accepts the interface
func NewService(store Store) *Service {
	return &Service{store: store}
}

// 3. The Provider (Adapter) returns the concrete struct
type PostgresStore struct { /* ... */ }

func NewPostgresStore(db *sql.DB) *PostgresStore {
	return &PostgresStore{db: db}
}
```

### Goroutines and Context Loss (Trace-First) [#goroutines-and-context-loss-trace-first]

**The Go Idiom:** "Don't leave goroutines hanging, and always pass context."\
**The Principle Connection:** We practice [Trace-First Observability](/docs/principles/quality/observability). Executing a background goroutine without passing context severs the OpenTelemetry trace, blinding our dashboards to system behavior.

When spawning an asynchronous background task, use `context.WithoutCancel` (introduced in Go 1.21) or extract/inject the trace so the background span remains a child of the request trace—even if the HTTP client disconnects early.

```go
func (s *Service) ProcessAsync(ctx context.Context) {
	// Prevent the goroutine from dying if the HTTP request closes early,
	// but preserve the Trace Context so the background task is observable.
	bgCtx := context.WithoutCancel(ctx)
	
	go func() {
		_, span := s.tracer.Start(bgCtx, "ProcessAsync.Background")
		defer span.End()
		
		// Execute asynchronous domain work...
	}()
}
```

### Immutability in the Domain (Domain Purity) [#immutability-in-the-domain-domain-purity]

**The Go Idiom:** Receiver types (Pointer vs Value semantics).\
**The Principle Connection:** Our Domain layer must remain pure and free from unpredictable side effects.

When creating methods on Domain entities that calculate or evaluate state **rather than modifying it**, enforce immutability by exclusively using **value receivers**. This ensures core business rules remain deterministic, trivially unit-testable, and free from accidental pointer mutation.

```go
package domain

// Meeting is our core domain entity.
type Meeting struct {
	DurationSeconds int
	Status          string
}

// CalculateCost utilizes a value receiver (m Meeting) instead of a pointer (*Meeting).
// This guarantees the calculation logic cannot accidentally mutate the Meeting's state.
func (m Meeting) CalculateCost(rate float64) float64 {
	return float64(m.DurationSeconds) * rate
}
```


# Core Service (Go) (/docs/learn/services/core)


# Core Service (Go) [#core-service-go]

`wordloop-core` is the platform's system of record. It handles all database interactions, operational transactional logic, and asynchronous job orchestration.

> \[!IMPORTANT]\
> The Core service exposes a strictly typed REST API via [Huma](https://huma.rocks), ensuring absolute contract adherence.

## Architecture & Layout [#architecture--layout]

The project strictly abides by Clean Architecture principles, enforcing strong boundaries between domain logic and side-effects.

```text
services/wordloop-core/
├── cmd/api/main.go          # Entrypoint (DI wiring & server boot)
├── internal/
│   ├── core/domain/         # Entities, Value Objects, Pure Logic
│   ├── core/service/        # Orchestration & Use-Cases
│   ├── entrypoints/         # HTTP Handlers (Huma), Clerk JWT Middleware
│   └── provider/            # Postgres, Pub/Sub, Storage Adapters
└── scripts/migrations/      # SQL up/down migrations
```

## Local Development Workflow [#local-development-workflow]

Run the Go server locally with standard tools and the consolidated CLI driver.

1. **Start Infrastructure Services**
   ```bash
   ./dev start infra
   ```
   *(Boots Postgres, Pub/Sub, Storage Emulators, and the OTel Aspire Dashboard)*

2. **Execute Database Migrations**
   ```bash
   ./dev db migrate
   ```

3. **Start the API Server**
   ```bash
   cd services/wordloop-core
   go run cmd/api/main.go
   ```

## Development Guidelines [#development-guidelines]

> \[!WARNING]\
> Always adhere to the [Core Architecture Rules](architecture.mdx).

If your changes expose new HTTP endpoints, you must regenerate the OpenAPI client before committing. Run:

```bash
./dev gen api
```


# Architecture Rules (/docs/learn/services/ml/architecture)


# Architecture Rule: Clean Architecture for WordLoop ML [#architecture-rule-clean-architecture-for-wordloop-ml]

## 1. Context & Scope [#1-context--scope]

This rule physically applies our [Service Architecture Principles](/docs/principles/system-design/hexagonal-architecture) to all Python code within `src/wordloop`.

We divide the code structurally based on its behavior: the core business logic remains isolated in `src/wordloop/core/`, inbound traffic is handled by `src/wordloop/entrypoints/`, and all external integrations belong in `src/wordloop/providers/`.

## 2. Architectural Layers (Inward Dependency Flow) [#2-architectural-layers-inward-dependency-flow]

**Dependencies must only point INWARD.** Inner layers must never import from outer layers.

### **Domain** (`src/wordloop/core/domain`) [#domain-srcwordloopcoredomain]

* **Purpose:** Business entities and core logic using `dataclasses` or `Pydantic`.
* **Zero-Dependency Core:** Standard library and Pydantic/Dataclasses only. No I/O.
* **Framework Agnostic Definitions:** Models must remain free from database-specific decorators or library-specific types (e.g., avoid `SQLAlchemy` ORM models here).
* **Universal Vocabulary:** Define core application exceptions (`src/wordloop/core/exceptions.py`) and constants here.
* **Testing:** Pure Unit Tests. Verify state transitions and business rules with zero mocks.
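
As a rough illustration of the Domain rules above, here is a minimal sketch of an immutable entity with one state transition. The `Transcript` entity, the `TranscriptStatus` enum, and `InvalidTransitionError` are hypothetical names chosen for the example, not taken from the actual codebase; the exception would normally live in `src/wordloop/core/exceptions.py`.

```python
from dataclasses import dataclass, replace
from enum import Enum


class TranscriptStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"


class InvalidTransitionError(Exception):
    """Hypothetical core exception for illegal state changes."""


@dataclass(frozen=True)
class Transcript:
    """Hypothetical domain entity: standard library only, no I/O."""

    meeting_id: str
    text: str = ""
    status: TranscriptStatus = TranscriptStatus.PENDING

    def complete(self, text: str) -> "Transcript":
        """Return a new completed Transcript; the original is never mutated."""
        if self.status is not TranscriptStatus.PENDING:
            raise InvalidTransitionError(
                f"cannot complete a {self.status.value} transcript"
            )
        if not text.strip():
            raise InvalidTransitionError("a completed transcript needs text")
        return replace(self, text=text, status=TranscriptStatus.COMPLETED)
```

Because the entity is frozen and the transition returns a new value, a unit test only needs to construct a `Transcript`, call `complete`, and compare the result. No mocks are involved.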

### **Gateways** (`src/wordloop/core/gateways`) [#gateways-srcwordloopcoregateways]

* **Purpose:** `typing.Protocol` or `abc.ABC` definitions that define **capabilities**.
* **Contractual Masters:** Gateways define *what* (e.g., `store`), never *how*.
* **No Leaky Abstractions:** Signatures must use **Domain** entities. Never reference SDK types (e.g., `openai.ChatCompletion`) or transport types.
* **The Golden Rule:** Use generic names. `publish(msg: Message)`, not `send_to_sqs(msg: Message)`.
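
The Gateway rules can be sketched as follows, using the `publish(msg: Message)` signature from the Golden Rule. The `Message` fields, the `MessagePublisher` name, the in-memory implementation, and the topic string are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Message:
    """Hypothetical domain entity carried across the gateway."""

    topic: str
    body: str


class MessagePublisher(Protocol):
    """Gateway: names the capability, says nothing about SQS or Pub/Sub."""

    def publish(self, msg: Message) -> None: ...


class InMemoryPublisher:
    """Satisfies the Protocol structurally; no inheritance required."""

    def __init__(self) -> None:
        self.sent: list[Message] = []

    def publish(self, msg: Message) -> None:
        self.sent.append(msg)


def notify_transcribed(publisher: MessagePublisher, meeting_id: str) -> None:
    # The caller only sees the capability, never the transport.
    publisher.publish(Message(topic="meeting.transcribed", body=meeting_id))
```

Since the signature mentions only `Message`, swapping the transport later would not touch any caller of `notify_transcribed`.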

### **Services** (`src/wordloop/core/services`) [#services-srcwordloopcoreservices]

* **Purpose:** Use-case orchestration. This is where the application "decides" what happens.
* **Dependency Injection:** Services depend on **Gateways** (Protocols/ABCs), not concrete Providers.
* **Protocol Consumer Rule:** Services should return concrete Domain objects. If a Service is used by an Entrypoint, the **Entrypoint** defines the Protocol/Interface it requires from the Service.
* **Transaction Boundaries:** Coordinate workflows by fetching data, applying domain logic, and persisting results.

### **Providers** (`src/wordloop/providers/`) [#providers-srcwordloopproviders]

* **Purpose:** Concrete implementations of Gateway interfaces (The "Adapter").
* **Mapping (Domain Alignment):** Translates external SDK responses into **Domain Entities**.
* **Error Wrapping:** Catch library-specific errors (e.g., `botocore.exceptions.ClientError`) and raise a corresponding **Core Exception** defined in the Domain layer.
* **Testing:** Integration Tests only. Use `testcontainers-python` to verify actual I/O against real instances.

### **Entrypoints** (`src/wordloop/entrypoints/`) [#entrypoints-srcwordloopentrypoints]

* **Purpose:** The interaction layer (FastAPI, CLI).
* **Boundary Validation:** Entrypoints validate inputs, map request schemas to Domain objects, and call a Service. They delegate all business decisions to the Core.
* **Testing:** Use `fastapi.testclient.TestClient` with mocked Services to verify routing and status codes.

***

## 3. Dependency Injection & State [#3-dependency-injection--state]

* **Constructor Injection:** Use `__init__` for all dependencies.
* **Explicit Lifecycles:** Initialize database clients solely at the entrypoint startup (e.g., `lifespan` in FastAPI) and inject them rather than relying on global singletons.
* **Wiring:** All concrete Provider-to-Service wiring happens at the outermost edge (the Entrypoint or a dedicated `container.py`).
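
A minimal sketch of these three rules, assuming a hypothetical `TranscriptionService` with two gateways. All class names and the `build_service` helper are illustrative; in the real service the wiring function would sit in the entrypoint or a `container.py` and construct real Providers instead of the fakes used here.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Transcript:
    meeting_id: str
    text: str


class Transcriber(Protocol):
    def transcribe(self, audio_url: str) -> str: ...


class TranscriptStore(Protocol):
    def store(self, transcript: Transcript) -> None: ...


class TranscriptionService:
    """Depends on Gateways only; dependencies arrive through __init__."""

    def __init__(self, transcriber: Transcriber, store: TranscriptStore) -> None:
        self._transcriber = transcriber
        self._store = store

    def process(self, meeting_id: str, audio_url: str) -> Transcript:
        transcript = Transcript(meeting_id, self._transcriber.transcribe(audio_url))
        self._store.store(transcript)
        return transcript


class FakeTranscriber:
    """Stand-in Provider used only for this sketch."""

    def transcribe(self, audio_url: str) -> str:
        return f"transcript of {audio_url}"


class InMemoryStore:
    """Stand-in Provider used only for this sketch."""

    def __init__(self) -> None:
        self.saved: dict[str, Transcript] = {}

    def store(self, transcript: Transcript) -> None:
        self.saved[transcript.meeting_id] = transcript


def build_service() -> tuple[TranscriptionService, InMemoryStore]:
    """Wiring edge: the only place that names concrete classes."""
    store = InMemoryStore()
    return TranscriptionService(FakeTranscriber(), store), store
```

The Service never imports a concrete Provider, so replacing `InMemoryStore` with a database-backed implementation would change `build_service` and nothing else.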

***

## 4. Integrity & System Testing [#4-integrity--system-testing]

### **Bootstrap Verification (The "Smoke" Test)** [#bootstrap-verification-the-smoke-test]

To ensure the application is wired correctly:

* **Wiring Test:** A test in `tests/system/test_bootstrap.py` that attempts to initialize the full dependency tree.
* **Validation:** Ensures that all required environment variables are present and that the DI container (or manual wiring) doesn't fail on startup.
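
One possible shape for the validation half of such a test is sketched below. The variable names `DATABASE_URL` and `PUBSUB_TOPIC`, the `Settings` dataclass, and `ConfigurationError` are placeholders, not the service's actual configuration. Passing the environment in as a mapping keeps the check testable without touching `os.environ`.

```python
from collections.abc import Mapping
from dataclasses import dataclass

# Hypothetical variable names; substitute the service's real configuration.
REQUIRED_ENV = ("DATABASE_URL", "PUBSUB_TOPIC")


class ConfigurationError(Exception):
    """Raised at startup when required configuration is absent."""


@dataclass(frozen=True)
class Settings:
    database_url: str
    pubsub_topic: str


def load_settings(env: Mapping[str, str]) -> Settings:
    """Fail fast, listing every missing variable in one error."""
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        raise ConfigurationError(
            f"missing environment variables: {', '.join(missing)}"
        )
    return Settings(
        database_url=env["DATABASE_URL"],
        pubsub_topic=env["PUBSUB_TOPIC"],
    )


def test_bootstrap_reports_missing_variables() -> None:
    try:
        load_settings({"DATABASE_URL": "postgresql://localhost/test"})
    except ConfigurationError as exc:
        assert "PUBSUB_TOPIC" in str(exc)
    else:
        raise AssertionError("expected ConfigurationError")
```

A full wiring test would go one step further and hand the resulting `Settings` to the real container so that every Provider and Service is constructed once.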

### **Golden Path System Tests** [#golden-path-system-tests]

* **Location:** `tests/system/`.
* **Strategy:** Run a live instance of the app (e.g., using `uvicorn` in a subprocess or `TestClient` with real providers) against real infrastructure via `testcontainers`.
* **Zero Mocks:** These tests verify the "Golden Thread" from the API route all the way to the database/third-party SDK.
* **Scope:** Focus strictly on high-value success paths.

***

## 5. Core Engineering Standards [#5-core-engineering-standards]

1. **Pydantic Everywhere:** Use Pydantic for all data boundaries (Request/Response and Domain).
2. **Structured Logging:** Use structured logging libraries rather than basic print statements to ensure trace fidelity.
3. **Acyclic Dependencies:** Prevent import cycles, as they act as an immediate signal of leaked layer responsibilities.
4. **Strict Boundaries:** Maintain clean layers by preventing FastAPI `Depends`, `Request`, or SQL `Session` objects from entering the Service or Domain layers.
5. **Clean Containers:** Always ensure `container.stop()` or similar cleanup is called in `pytest` fixtures to prevent resource leaks in CI.
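
To make the structured logging standard concrete, the sketch below emits one JSON object per log line using only the standard library. It is an illustration of the idea, not the service's actual logging setup (a dedicated library may be used instead), and the `trace_id` / `span_id` field names are assumptions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Populated via `extra={...}`; None when no trace is active.
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str) -> logging.Logger:
    """Attach the JSON formatter to a stream handler for the given logger."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `logger.info("transcribed %s", meeting_id, extra={"trace_id": trace_id})` then produces a line that log tooling can filter by trace.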


# ML Implementation Guide (/docs/learn/services/ml/implementation)


# ML Implementation Guide (Python) [#ml-implementation-guide-python]

This guide translates WordLoop's overarching Engineering Principles into explicit, copy-pasteable Python code for the `wordloop-ml` service.

## 1. Concrete Trace-First Development [#1-concrete-trace-first-development]

We rely on OpenTelemetry for all observability. Because Python requires explicit context propagation in background tasks, we must properly extract and inject W3C Baggage.

### Initializing a Span [#initializing-a-span]

A new operation must start a span. In FastAPI, this is often handled automatically, but for background pipeline tasks, you must explicitly declare it.

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def process_audio(meeting_id: str) -> None:
    # 1. Start the span
    with tracer.start_as_current_span("ProcessAudio") as span:
        # 2. Enrich the span with concrete attributes
        span.set_attribute("meeting.id", meeting_id)
        
        # ... processing logic
```

### Passing Context [#passing-context]

When publishing to Pub/Sub or calling another service, you must explicitly inject the current trace context into the HTTP headers or message attributes.
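
In the service itself, OpenTelemetry's propagation API (`opentelemetry.propagate.inject`) would normally write these entries for you. The standard-library sketch below only shows the shape of the W3C `traceparent` value that ends up in the message attributes, and how a consumer could read it back. The helper names and the `meeting_id` attribute are hypothetical.

```python
import re
from typing import Optional

# W3C Trace Context: version-traceid-spanid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject_trace_context(
    attributes: dict[str, str], trace_id: str, span_id: str, sampled: bool = True
) -> dict[str, str]:
    """Return a copy of the message attributes carrying a `traceparent` entry."""
    flags = "01" if sampled else "00"
    carrier = dict(attributes)
    carrier["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return carrier


def extract_trace_context(attributes: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return (trace_id, span_id), or None if the entry is absent or malformed."""
    match = _TRACEPARENT.match(attributes.get("traceparent", ""))
    if match is None:
        return None
    return match.group(1), match.group(2)


# Publisher side: attach the context to the outgoing Pub/Sub attributes.
outgoing = inject_trace_context(
    {"meeting_id": "m-123"},
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    span_id="00f067aa0ba902b7",
)
# Consumer side: recover the parent identifiers before starting the child span.
parent = extract_trace_context(outgoing)
```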

## 2. Concrete Error Handling [#2-concrete-error-handling]

We use explicit Python `Exception` subclasses defined in our pure Domain to prevent external SDK errors from polluting our business logic.

### Defining Sentinels [#defining-sentinels]

Define business rule errors in `src/wordloop/core/exceptions.py`. They should inherit from a base `WordLoopError`:

```python
class WordLoopError(Exception):
    """Base exception for all Wordloop errors."""
    def __init__(self, message: str, cause: Exception | None = None):
        super().__init__(message)
        self.cause = cause

class ModelInferenceError(WordLoopError):
    """Raised when an AI model fails to return a valid response."""
```

### Wrapping & Mapping Errors in Providers [#wrapping--mapping-errors-in-providers]

An Adapter (Provider) interacting with the AssemblyAI SDK or OpenAI SDK *must* catch the library-specific error and raise a pure Domain exception.

```python
import assemblyai as aai
from wordloop.core.exceptions import ModelInferenceError

class AssemblyAIProvider:
    def transcribe(self, url: str) -> str:
        try:
            transcript = aai.Transcriber().transcribe(url)
            if transcript.error:
                raise ModelInferenceError(f"AssemblyAI failed: {transcript.error}")
            return transcript.text
        except aai.errors.AssemblyAIError as e:
            # Map the infrastructure error to a Domain concept; `from e` keeps the traceback chain.
            raise ModelInferenceError("AssemblyAI SDK crashed", cause=e) from e
```

## 3. Concrete Dependency Injection [#3-concrete-dependency-injection]

We use Python's `Protocol` from the `typing` module to define Interfaces (Ports).

### The Port (Defined by the Core) [#the-port-defined-by-the-core]

The protocol belongs in `src/wordloop/core/gateways/` and strictly uses Domain language, completely ignorant of AssemblyAI or Postgres.

```python
from typing import Protocol
from wordloop.core.domain.models import TranscriptionResult

class TranscriptionProvider(Protocol):
    def transcribe(self, audio_uri: str) -> TranscriptionResult:
        ...
```

### The Wiring (Entrypoint) [#the-wiring-entrypoint]

We use constructor injection: FastAPI's `Depends` system resolves the provider during the request lifecycle, and the wiring function passes it to the service's constructor.

```python
from fastapi import Depends
from wordloop.core.services import AudioService
from wordloop.providers.assembly import AssemblyAIProvider

# Dependency Injection function
def get_audio_service(
    provider: AssemblyAIProvider = Depends()
) -> AudioService:
    # The AudioService requires a TranscriptionProvider protocol!
    return AudioService(provider=provider)
```
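
Because the service depends only on the Port, a test can hand it any object with a matching `transcribe` method. The sketch below is self-contained and uses simplified stand-ins: the single `text` field on `TranscriptionResult`, the `transcript_text` method, and `FakeTranscriptionProvider` are assumptions for illustration, not the real Domain definitions.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TranscriptionResult:
    # Simplified stand-in for the Domain model; the real fields may differ.
    text: str


class TranscriptionProvider(Protocol):
    def transcribe(self, audio_uri: str) -> TranscriptionResult:
        ...


class AudioService:
    # Simplified stand-in: depends only on the Port, never on a concrete SDK.
    def __init__(self, provider: TranscriptionProvider) -> None:
        self._provider = provider

    def transcript_text(self, audio_uri: str) -> str:
        return self._provider.transcribe(audio_uri).text


class FakeTranscriptionProvider:
    """Satisfies the Protocol structurally; no inheritance or registration needed."""

    def transcribe(self, audio_uri: str) -> TranscriptionResult:
        return TranscriptionResult(text=f"transcript of {audio_uri}")


service = AudioService(provider=FakeTranscriptionProvider())
text = service.transcript_text("gs://bucket/audio.webm")
```

In a FastAPI test, the same fake could be supplied through `app.dependency_overrides` so the route under test never touches the real provider.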

## 4. Idiomatic Python & Standards [#4-idiomatic-python--standards]

We do not aim to rewrite foundational guidance on writing excellent Python code. Instead, we adhere to established industry baselines and map them to our internal engineering principles.

We expect all Wordloop ML engineers to understand:

* [PEP 8](https://peps.python.org/pep-0008/) for fundamental code style and naming conventions.
* [Google Python Style Guide](https://google.github.io/styleguide/pyguide) for enterprise-level structure and docstring consensus.

Below is concrete guidance on how overarching Python idioms manifest as system-enforced architectural invariants.

### Strict Typing over Duck Typing (Clean Architecture) [#strict-typing-over-duck-typing-clean-architecture]

**The Python Idiom:** Using strong static typing (`mypy`) instead of traditional dynamic duck-typing.\
**The Principle Connection:** [Clean Architecture (Ports and Adapters)](/docs/principles/system-design/hexagonal-architecture) relies heavily on explicit Contracts/Ports across boundaries. We enforce the use of `typing.Protocol` and strict type hints on all domain models so that dependency inversion can be verified statically by the type checker.

```python
from typing import Protocol
from dataclasses import dataclass

@dataclass(frozen=True)
class TranscriptionRequest:
    audio_url: str
    target_language: str

# 1. We use a strictly typed Protocol instead of relying on duck-typed methods.
class TranscriptionProvider(Protocol):
    def transcribe(self, request: TranscriptionRequest) -> str:
        ...
```

### Context Managers for Resource Leaks (Resilience) [#context-managers-for-resource-leaks-resilience]

**The Python Idiom:** Using `with` and `@contextmanager` for resource lifecycle management.\
**The Principle Connection:** We practice robust [Error Handling & Resilience](/docs/principles/quality/reliability). If an ML SDK or file stream raises an exception, failing to clean up memory or connections results in persistent leaks and eventual cluster death.

Always utilize Context Managers when handling stateful resources. This guarantees the `__exit__` cleanup executes even if your domain logic crashes.

```python
import os
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager

@contextmanager
def temporary_audio_file(audio_bytes: bytes) -> Iterator[str]:
    """Context manager to ensure ephemeral files are always deleted after processing."""
    # mkstemp creates the file atomically; tempfile.mktemp is deprecated and race-prone.
    fd, temp_path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(audio_bytes)
        yield temp_path
    finally:
        # This cleanup is guaranteed to run, preventing disk exhaustion.
        if os.path.exists(temp_path):
            os.remove(temp_path)

def process() -> None:
    # The file is deleted as soon as the block exits, normally or via an exception.
    with temporary_audio_file(b"...") as path:
        result = run_inference(path)  # run_inference is defined elsewhere
```


# ML Service (Python) (/docs/learn/services/ml)


# ML Service (Python) [#ml-service-python]

`wordloop-ml` operates as the platform's stateless asynchronous execution engine. It is responsible for processing audio payloads, interfacing securely with ML APIs (such as AssemblyAI), and normalizing telemetry constraints.

<Callout type="info">
  The service exposes a synchronous REST interface via FastAPI but primarily executes within a custom worker consuming AsyncAPI Pub/Sub events.
</Callout>

## Architecture & Layout [#architecture--layout]

The Python stack adheres to pure Clean Architecture logic and utilizes modern Python (3.12+).

```text
services/wordloop-ml/
├── src/wordloop/
│   ├── core/domain/         # Pydantic state models (No logic leaks)
│   ├── core/gateways/       # typing.Protocol interface definitions
│   ├── core/services/       # Orchestration workflows
│   ├── entrypoints/         # FastAPI Routes, Pub/Sub Worker Consumers
│   └── providers/           # Concrete external integrations (AssemblyAI, GCP)
├── tests/                   # unit/ and system/
└── pyproject.toml           # `uv` managed dependencies
```

## Local Development Workflow [#local-development-workflow]

Our Python architecture relies entirely on `uv` for ultra-fast, predictable dependency management and virtual environments.

1. **Start Platform Dependencies**
   ```bash
   ./dev start infra core
   ```
   *(Boots Emulators, Observability dashboard, and the Core Go service)*

2. **Boot the API Server**
   ```bash
   cd services/wordloop-ml
   uv run wordloop-api
   ```

3. **Boot the Async Worker (Pub/Sub)**
   ```bash
   cd services/wordloop-ml
   uv run wordloop-worker
   ```

## Development Guidelines [#development-guidelines]

* **Pydantic Everywhere:** Use Pydantic models to strictly serialize, deserialize, and validate I/O boundaries.
* **Service Identity & Core Interaction:** When writing back to Core, ML must inject the `SERVICE_AUTH_TOKEN` generated via `./dev setup`.

<Callout type="warn">
  Never bypass interface restrictions. Always examine the [ML Architecture Rules](architecture.mdx) before injecting new dependency chains into a workflow.
</Callout>


# Data Flow (/docs/work/_template/tdd/data-flow)


{/*
  LLM CONTEXT — DATA FLOW DOC
  Bet: <bet-slug>
  Purpose: Maps every user action from the UI Design through service boundaries.
  Services: App (Next.js) | Core (Go) | ML (Python) | [add/remove as needed]
  Persistent stores: PostgreSQL (Core) | GCS | [add/remove as needed]
  Protocol inventory: REST | WebSocket | Pub/Sub | [add/remove as needed]
  */}

# Data Flow [#data-flow]

<Callout title="Scope check" type="warn">
  Diagrams use **descriptive operation labels** — not endpoint paths, header names, or field names. Those belong in the Contracts doc. If you find yourself writing `POST /meetings/:id/tasks` or `Authorization: Bearer` in a diagram label, move it there.
</Callout>

***

## System Context [#system-context]

*The topology of the system — which services exist and how they connect. This is a map, not a sequence. Draw it once, at the top, before any flows.*

<Mermaid
  chart="`flowchart LR
  App[&#x22;App\n(Next.js)&#x22;]
  Core[&#x22;Core\n(Go)&#x22;]
  ML[&#x22;ML\n(Python)&#x22;]
  DB[(&#x22;PostgreSQL\n(Core)&#x22;)]
  Store[(&#x22;Object Store\n(GCS)&#x22;)]
  Ext[&#x22;External API\n(Third-party)&#x22;]

  App <-->|&#x22;REST + WebSocket&#x22;| Core
  Core <-->|&#x22;Pub/Sub&#x22;| ML
  Core --- DB
  Core --- Store
  ML --- Ext`"
/>

*Edit this graph to match the actual services in this bet. Every node shown here should appear as a participant in at least one flow below.*

***

## Flows [#flows]

*Group flows into logical **Parts** — one Part per major phase of the user journey. Name each flow after what triggers it, not after the implementation.*

***Rule:** Labels describe the operation, never the implementation. "Create task (idempotent, echo-suppressed)" is correct. "POST /meetings/:id/tasks" belongs in Contracts.*

### Part 1 — \[Phase Name] [#part-1--phase-name]

*What the user is doing and what the system is setting up during this phase.*

#### Flow 1: \[Flow Name] [#flow-1-flow-name]

*One sentence: what the user does to trigger this, and what state the system reaches when it completes.*

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB as PostgreSQL

  App->>Core: Initiate action (descriptive label)
  Core->>DB: Persist initial state
  Core-->>App: Acknowledge with initial data`"
/>

*Explain non-obvious sequencing decisions — why async vs sync, why this ordering constraint — in a sentence below the diagram, not as diagram annotations.*

***

#### Flow 2: \[Flow Name] [#flow-2-flow-name]

*One sentence describing trigger and outcome.*

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant ML

  Core->>ML: Trigger background job (via Pub/Sub)
  ML-->>Core: Result callback`"
/>

***

### Part 2 — \[Phase Name] [#part-2--phase-name]

*What the system does continuously during this phase, and what the user sees in response.*

#### Flow 3: \[Flow Name] [#flow-3-flow-name]

*One sentence describing trigger and outcome.*

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core

  App->>Core: Descriptive operation
  Core-->>App: Descriptive response`"
/>

***

### Part N — Failure Modes [#part-n--failure-modes]

*Failure flows are **required**. For every significant service boundary in this bet, there must be at least one flow showing what happens when that boundary fails and how the system recovers. Model failures that would cause data loss or silent breakage — not every possible error.*

*If the UI Design doc models a "Degraded" or "Connectivity Lost" state, the corresponding recovery sequence must appear here.*

#### Flow N: \[Failure Scenario Name] [#flow-n-failure-scenario-name]

*What failure condition triggers this, which boundaries it affects, and the recovery sequence.*

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core

  Note over App,Core: [Failure] detected
  App->>Core: Recovery initiation
  Core-->>App: Recovery acknowledgement with state`"
/>

***

## Design Decisions [#design-decisions]

*Required. Record decisions that shaped the flows above. If a future engineer would ask "why did you do it this way?", it belongs here. Common categories:*

* ***Infrastructure constraints** — what the bet assumes exists (or doesn't). E.g., "sticky session affinity, no pod-to-pod backplane." If the missing infrastructure is significant, extract a separate problem statement for it.*
* ***Scope boundaries** — capabilities explicitly deferred to a future version and why. Reference the relevant No-Go in the pitch.*
* ***Performance / latency choices** — what was optimised and what was traded. E.g., "pre-warm the upstream session on permission grant, not on first data."*
* ***Lifecycle / cleanup policies** — what temporary data is created, when it's deleted, and what safety window exists.*
* ***Protocol / pattern choices** — why this protocol over alternatives. E.g., "HTTP stream not WebSocket for service-to-service, because..."*

| Decision           | Alternatives considered      | Why this                                     |
| ------------------ | ---------------------------- | -------------------------------------------- |
| *What was decided* | *What else was on the table* | *The constraint or tradeoff that settled it* |

***

## Boundary Inventory [#boundary-inventory]

*Every service-to-service boundary shown in the flows above. This table feeds directly into the Contracts doc — each row here becomes a contract entry.*

| Boundary                         | Flows    | From → To           | Protocol              | Data shape                                                   |
| -------------------------------- | -------- | ------------------- | --------------------- | ------------------------------------------------------------ |
| *Descriptive name for this call* | *Flow N* | *Service → Service* | *REST / WS / Pub/Sub* | *What information crosses: operation, key fields, direction* |


# Overview (/docs/work/_template/tdd)


# Technical Design Document [#technical-design-document]

> **Status**: Draft | Agreed
> **Author**: *@handle*
> **Date**: *YYYY-MM-DD*

## Success Criteria [#success-criteria]

*What does "solved" look like? Define measurable criteria from the user's perspective and the system's perspective. The problem context lives in the [Pitch](../pitch) — don't restate it here.*

| Criterion              | Measured by            |
| ---------------------- | ---------------------- |
| *User-visible outcome* | *How you'll verify it* |

## Architectural Approach [#architectural-approach]

*The approach taken and the key decisions made. What options were considered? Which was chosen and why? Not implementation detail — the reasoning. Link to the Design Decisions table in Data Flow for the full rationale.*

## Constraints [#constraints]

*Architectural constraints discovered during design, beyond the [no-gos in the Pitch](../pitch). Include links to problem statements for known limitations that are deliberately deferred.*

* *Example: \[constraint or deferred limitation]*

## Open Questions [#open-questions]

*What is still unknown, who owns the answer, and when it must be resolved. An empty table is a warning sign.*

| Question                | Owner  | Status                           |
| ----------------------- | ------ | -------------------------------- |
| *What is the question?* | *@who* | *To verify / Resolved / Blocked* |

***

## Navigation Map [#navigation-map]

*Link to each TDD sub-document with a one-line description of what it covers. Helps readers orient quickly.*

| Document                 | What it covers                                        |
| ------------------------ | ----------------------------------------------------- |
| [UI Design](ui-design)   | *Wireframes, screen states, and interaction patterns* |
| [Data Flow](data-flow)   | *Sequence diagrams and design decisions*              |
| [Contracts](contracts)   | *API shapes for every boundary*                       |
| [Schemas](schemas)       | *Database table designs*                              |
| [Milestones](milestones) | *Build plan broken into shippable slices*             |

***

## Architecture Scaffolding [#architecture-scaffolding]

To maintain structural consistency and immediate integration with the test suite, always use the CLI to generate the remaining TDD components.

**Add a Milestone**

```bash
./dev new milestone {{BET_SLUG}} <milestone-slug>
```

**Add a Domain Slice** (Automatically connects to pytest suite)

```bash
./dev new slice {{BET_SLUG}} <milestone-slug> <domain> <slice-slug>
```

**Design API Contracts**

```bash
# Scaffold the default contract tree for a service bet:
# contracts/core/{rest,websocket,pubsub}.mdx
# contracts/ml/{rest,websocket}.mdx
./dev new contracts {{BET_SLUG}}

# Scaffold one additional boundary when the Boundary Inventory calls for it:
./dev new contract {{BET_SLUG}} <service> <protocol>
```

Contract docs live under `tdd/contracts/<service>/<protocol>.mdx`. The folder describes the API boundary between services in the ideal end state: REST resources and commands, WebSocket streams and events, Pub/Sub topics and event envelopes. Use the protocol-specific templates to start from the right checklist instead of a blank page.

**Design a Database Schema**

```bash
./dev new schema {{BET_SLUG}} <service> <database-tech>
```


# UI Design (/docs/work/_template/tdd/ui-design)


# UI Design [#ui-design]

*This document describes what the user sees and does — not how the system delivers it. Walk through each screen the bet touches, then map the journeys between them. The goal: enough concrete detail that a data flow, API contracts, and database schema can be designed from this document alone.*

***Scope check:** If you're specifying which service owns the logic, how the frontend integrates, or where data persists — you've gone too far. That belongs in the Data Flow doc.*

***

## 1. Screens [#1-screens]

*For each screen, follow this structure: a brief description, a wireframe, the layout, the states it can be in, and the key interactions. One subsection per screen.*

### Screen Name [#screen-name]

*One sentence: what is the user trying to do on this screen?*

*Include a wireframe — even a rough sketch. The wireframe is the anchor; the text describes it.*

{/* ![Wireframe](/images/bets/bet-slug/screen_name.png) */}

**Layout:**
*Describe the layout regions and what content lives in each one. Be specific about what fields and controls exist.*

* **Region name:** *What's here. If it's an input, say what kind (free text, dropdown, rich text). If it auto-saves, say so. If there's a component being reused, name it.*

**States:**

| State                 | What the user sees                                                              |
| --------------------- | ------------------------------------------------------------------------------- |
| *Loading / skeleton*  | *What's visible while data arrives? Grey blocks? Spinner? Disabled controls?*   |
| *Active / happy path* | *Everything working normally.*                                                  |
| *Empty*               | *No data yet — what does the user see? A prompt? A placeholder?*                |
| *Error / degraded*    | *Something went wrong — what's visible, what still works, how does it recover?* |

*Think about: What does the user see in the first second? The first 10 seconds? After an hour?*

***For any feature that depends on a live connection or external service:** each failure mode gets its own named row — not just "Error / degraded". A live recording screen, for example, needs separate rows for Connecting, Degraded (ML down but audio continues), Connectivity Lost, and Reconnected. If the Data Flow doc will have a failure mode flow for it, this table needs a row for it. The two docs must stay in sync.*

**Key Interactions:**
*What can the user do on this screen? For each interaction, describe what happens in response. Stay at the user level — "the task appears in the list" not "the API returns 201".*

* **Interaction name:** *What the user does → what they see in response.*

*Be specific about data objects. If the screen shows tasks, define what a task is: what fields does it have? Which are required? Can they be nested? What states can they be in? Does editing change their classification? Name the component if reusing one.*

***

### Another Screen [#another-screen]

*Repeat the same structure. Include screens for both the primary flow and any secondary views (tabs, modals, expanded states).*

***

## 2. User Journeys [#2-user-journeys]

*Map how the user moves between screens. One journey per major flow. Use simple ASCII diagrams — they're scannable and diffable. Include both the happy path and the key branches (permission denied, error recovery, etc.).*

### Primary Journey [#primary-journey]

```text
[Starting point] → [Screen A]
       │
       ▼
  [Decision point] ──failure──→ [Error state / blocking modal]
       │ success
       ▼
  [Screen B]
       │
       │  ← what the user does here, what the system shows them
       │
       ▼
  [Screen C]
```

### Secondary Journey [#secondary-journey]

```text
  [Screen C]
       │
       │  ← review, edit, explore
       │
       ├──→ [Screen D]
       │      └──→ back to [Screen C]
```

***

## 3. Edge Cases [#3-edge-cases]

*Anything that doesn't fit neatly into a screen's state table. Focus on user-visible behaviour, not system internals.*

*Prompt yourself with these categories — not every bet will hit all of them, but each is worth considering:*

* ***Concurrent access** — same user in multiple tabs, same resource accessed by multiple users*
* ***Session boundaries** — what happens on tab close, browser crash, token expiry, long idle periods?*
* ***Resource limits** — very long sessions, very large data sets, quota exhaustion*
* ***Background/foreground** — what happens when the tab is backgrounded or the device sleeps?*

| Scenario                        | Behaviour                                |
| ------------------------------- | ---------------------------------------- |
| *What goes wrong or is unusual* | *What the user sees and can do about it* |


# Problem Statement (/docs/work/delivered/live-capture/01-problem-statement)


{/* LLM-Context: TL;DR:
  Problem Statement for the Meeting Recording bet.
  Problem: No live capture flow; meetings enter only via file upload or manual text entry.
  Appetite: Large (the most complex bet the platform has run so far).
  Why now: Live recording is the missing foundation for real-time AI value.
  */}

# Problem Statement [#problem-statement]

> **Status**: Accepted
>
> **Author**: Ryan Nel
>
> **Date**: 2026-04-18

***

## Observed Problem [#observed-problem]

Users need a way to capture meetings — both in-person and virtual — directly from the WordLoop app without relying on third-party recording tools or manual note-taking. Today, meetings can only enter the system via file upload or manual text entry — both of which are post-hoc and require the meeting to have already happened. There is no live capture flow.

The core pain points are:

1. **No live capture** — users must remember to take notes or record externally, then import later.
2. **No real-time feedback** — users have no visibility into what's being captured until processing completes.
3. **Post-processing delay** — insights (talking points, tasks, topics) are only available after a batch transcription and synthesis pipeline finishes.

***

## Appetite [#appetite]

**Large.** This is the most complex bet the platform has run so far. It introduces a binary audio streaming path, a bidirectional HTTP stream between Core and ML, a new recording lifecycle, speaker identification, post-meeting reprocessing, and audio playback — all coordinated across three services.

The appetite is deliberately accepted before scoping begins. If the full scope doesn't fit, we cut scope — we don't extend the appetite.

***

## Why Now [#why-now]

Live recording is the missing foundation for real-time AI value. File upload works but it delays every insight. The ML infrastructure (AssemblyAI, speaker diarisation, streaming insights) is already in place — this bet wires it to a live capture path. Without live recording, WordLoop remains a post-hoc tool. With it, it becomes a meeting partner.

***

## Output [#output]

Check [Bet Sizing](../../sizing) to confirm the appetite judgment is realistic. Then move to [The Pitch](pitch).


# Pitch (/docs/work/delivered/live-capture/02-pitch)


{/* LLM-Context: TL;DR:
  Pitch for the Meeting Recording bet.
  Core sketch: extend the WebSocket connection for binary audio, stream to GCS and ML in parallel,
  route ML insights back to the client via the same connection, trigger post-meeting reprocessing via Pub/Sub.
  Key rabbit holes: audio encoding choice, degraded ML handling, speaker ID confidence threshold.
  */}

# The Pitch [#the-pitch]

> **Status**: Accepted
>
> **Author**: Ryan Nel

***

## Problem [#problem]

See [Problem Statement](01-problem-statement): no live capture flow — meetings enter only via file upload or manual text entry. Appetite: Large.

***

## Solution Sketch [#solution-sketch]

Extend the existing WebSocket connection to carry binary audio frames upstream. Core receives audio and fans out in parallel: stream to GCS for durable storage, and stream to ML for live transcription. ML routes segments and insights back to Core via a persistent HTTP stream — Core broadcasts them to the client via WebSocket and persists them asynchronously.

When recording stops, Core publishes a `MeetingSessionTerminated` event. ML drains its AssemblyAI buffer and sends final segments. Core then triggers a post-meeting reprocessing job via Pub/Sub — the same pipeline used for file uploads, with `skip_tasks: true` to preserve tasks captured live.

**This bet does not change the data architecture pattern.** It extends Optimistic Mutation with Echo-Suppressed Streaming to cover:

* A new upstream path (audio frames)
* A new downstream path (transcript segments and ML insights in real time)

***

## Rabbit Holes [#rabbit-holes]

**Audio encoding.** The client captures audio in the browser. The ML service expects a specific format (PCM16 or WebM). Encoding decisions affect latency. We keep this simple: the client sends raw WebM chunks; Core forwards them as-is. No transcoding at Core.

**ML degradation.** If AssemblyAI is unavailable mid-recording, we cannot fail the session — the user is speaking. The bet requires graceful degradation: continue storing audio to GCS, show a warning, and recover via post-processing when services come back.

**Speaker identification confidence threshold.** Matching a voice embedding against known profiles requires a threshold. Too low: false matches. Too high: no matches. The threshold must be configurable without a deployment. Start with 0.85 and expose it as a server-side config value.
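
As a rough illustration of how the threshold gates a match, the sketch below compares an embedding against known profiles and returns a match only when the best score clears the threshold. Cosine similarity, the function names, and the toy two-dimensional embeddings are assumptions; the pitch does not prescribe the scoring method.

```python
import math
from typing import Optional

DEFAULT_THRESHOLD = 0.85  # starting value named in the pitch; meant to be configurable


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_speaker(
    embedding: list[float],
    profiles: dict[str, list[float]],
    threshold: float = DEFAULT_THRESHOLD,
) -> Optional[str]:
    """Return the best-matching profile ID, or None when nothing clears the threshold."""
    best_id, best_score = None, threshold
    for profile_id, profile_embedding in profiles.items():
        score = cosine_similarity(embedding, profile_embedding)
        if score >= best_score:
            best_id, best_score = profile_id, score
    return best_id


profiles = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
confident = match_speaker([0.9, 0.1], profiles)  # close to alice's profile
ambiguous = match_speaker([0.7, 0.7], profiles)  # equally far from both profiles
```

Raising the threshold turns borderline matches into "unknown speaker" results, while lowering it accepts more matches at the cost of more false positives.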

**Session state.** One active session per user. Core must enforce this — two concurrent recording sessions from the same user is an error, not a queuing scenario.

***

## No-Gos [#no-gos]

* No calendar integration — auto-starting from calendar events is a separate bet
* No multi-user collaborative recording
* No video capture — audio only
* No meeting bot integration (Zoom/Teams/Meet)
* No custom vocabulary or domain-specific tuning; use default AssemblyAI settings

***

## Output [#output]

Pitch is accepted. Move to [Design](design) to map the user journey and define what the UI needs before the API is designed.


# Data Flow (/docs/work/meeting-recording/tdd/data-flow)


{/* LLM-Context: TL;DR:
  Data flow for the Meeting Recording bet.
  20 flows across 3 sections:
  — Part 1 (Live Session): Start Recording (incl. concurrent session guard),
  Live Audio→Transcription (always-on OPFS shadow buffer + chunk sequencing),
  Live Insights Pipeline (3a batched talking points + tasks via single LLM structured
  output with existing task list for LLM-native deduplication,
  3c progressive speaker ID with lock-on), Degraded Mode (5 failure domains
  incl. GCS failure), Audio Gap Recovery, Audio Silence Detection,
  Duration Warning & Auto-Stop, Server-Side Inactivity Timeout (5 min default),
  Background Tab Audio Continuity (Web Workers).
  — Part 2 (User Mutations): Notes Auto-Save,
  User Creates Task, Task Mutations (full CRUD), User Labels Speaker,
  Create New Person During Speaker Labelling.
  — Part 3 (Session End & Post-Meeting): Stop Recording (4-phase: drain ML →
  collect OPFS gaps → compose GCS → publish TranscriptionJob),
  Batched Gap Upload (50 chunks per multipart request),
  Post-Meeting Processing, Soft-Deleted Meeting Handling (204 on write-back),
  Transcription Processing Lifecycle, Audio Playback (signed URL + readiness).
  Each arrow = contract boundary. All contracts formalised on the Contracts page.
  */}

# Data Flow [#data-flow]

This document sits between [UI Design](ui-design) (which defines what the user sees) and [Contracts](contracts) (which formalise the API shapes). For each screen and interaction, it draws what calls what: which service initiates, which responds, what data crosses each boundary. Read each arrow two ways: it is a **contract boundary** (what shape the data takes) and a **sequencing constraint** (downstream cannot build until the upstream contract is published).

## System Context [#system-context]

<Mermaid
  chart="`graph TB
  USER[&#x22;👤 User&#x22;] --> APP[&#x22;wordloop-app<br/>(Next.js)&#x22;]
  APP -- &#x22;REST (mutations)&#x22; --> CORE[&#x22;wordloop-core<br/>(Go)&#x22;]
  APP -. &#x22;WebSocket<br/>(events + audio)&#x22; .-> CORE
  CORE --> DB[(Postgres)]
  CORE --> GCS[(&#x22;☁️ Cloud Storage<br/>(audio files)&#x22;)]
  CORE -. &#x22;WebSocket<br/>(audio + segments)&#x22; .-> ML[&#x22;wordloop-ml<br/>(Python)&#x22;]
  ML -. &#x22;WebSocket<br/>(segments + insights)&#x22; .-> CORE
  ML --> AAI[&#x22;🎙️ AssemblyAI<br/>(transcription)&#x22;]
  ML --> LLM[&#x22;🤖 OpenAI<br/>(insights)&#x22;]
  CORE -. &#x22;Pub/Sub<br/>(async jobs)&#x22; .-> ML`"
/>

***

## Part 1: Live Session [#part-1-live-session]

Flows that run automatically during an active recording session — audio capture, ML insights, and system resilience.

### Flow 1: Start Recording [#flow-1-start-recording]

The user opens the **New Meeting ▾** dropdown and selects **Start Live Recording**. After the browser grants microphone access, the app creates a meeting and initiates the recording session across all three services.

**Pre-conditions:** The client checks for an active recording session before enabling the button. If one exists, "Start Live Recording" is disabled with a tooltip. Core enforces this server-side — if `StartRecordingCommand` arrives while a session is already running, it responds with `RecordingErrorEvent (session_conflict)`. If the browser denies microphone access, the app shows a blocking modal with a link to browser audio settings — no data flow occurs.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant GCS
  participant ML
  participant AAI as AssemblyAI

  App->>Core: Create meeting (live recording)
  Core->>DB: INSERT meeting
  Core-->>App: 201 Meeting
  Note over App: UI shows new meeting

  App->>Core: WS: StartRecordingCommand
  Core->>ML: POST /meetings/{id}/live-session (speaker states + voice profiles pushed)
  ML->>ML: Reconstruct speaker_label → state map from pushed data
  ML->>AAI: Open real-time transcription session
  ML-->>Core: 201 (websocket_url)
  Core->>ML: Open ML WebSocket (StreamStartEvent with speaker states)
  ML-->>Core: StreamReadyEvent
  Core-->>App: WS: RecordingStartedEvent
  Note over App: Banner transitions to &#x22;● Recording 00:00&#x22;`"
/>

### Flow 2: Live Audio → Transcription (Lowest Latency Path) [#flow-2-live-audio--transcription-lowest-latency-path]

Audio flows from the browser microphone through Core and ML to AssemblyAI. Transcript segments return via the **ML WebSocket** — the same long-lived connection Core uses to send audio. Audio chunks flow upstream as binary frames, segments and insights flow downstream as CloudEvents text frames.

**OPFS shadow buffer:** Every audio chunk is simultaneously written to an always-on shadow buffer maintained by a dedicated Web Worker using the Origin Private File System (OPFS) `createSyncAccessHandle()` API. Each chunk carries a monotonically incrementing sequence number assigned in the browser. This buffer runs unconditionally — it captures audio regardless of Core or GCS connectivity. It is cleared only after Core confirms all chunks are safely in GCS (see Flow 16 and Flow 9).

**Chunk-based GCS writes:** Instead of a single streaming write to one file, each audio chunk is stored as a separate GCS object keyed by sequence number: `meetings/{id}/chunks/{seq:08d}.webm`. WebM encodes its EBML header in the first chunk; subsequent chunks contain raw Cluster data. This structure enables gap recovery — any chunk missed due to a connectivity failure can be backfilled from OPFS by sequence number. At session end, Core composes the chunk objects into the final `audio.webm` (see Flow 9).
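
As an illustration of the key scheme above, the following minimal sketch builds and parses chunk object names. The helper names and the meeting id `m_123` are hypothetical; only the path pattern comes from this document.

```python
def chunk_object_key(meeting_id: str, seq: int) -> str:
    """Build the object name for one audio chunk (sequence zero-padded to 8 digits)."""
    if seq < 0:
        raise ValueError("sequence numbers are non-negative")
    return f"meetings/{meeting_id}/chunks/{seq:08d}.webm"


def seq_from_object_key(key: str) -> int:
    """Recover the sequence number from a chunk object name."""
    filename = key.rsplit("/", 1)[-1]            # e.g. "00000042.webm"
    return int(filename.removesuffix(".webm"))   # int() ignores the zero padding


keys = [chunk_object_key("m_123", s) for s in (0, 9, 10, 42)]
# Zero padding makes lexicographic order match numeric order,
# so a sorted object listing is already in sequence order.
assert keys == sorted(keys)
```

The fixed-width padding is what lets gap detection and the final compose step work from a plain sorted listing.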

Core streams segments directly to the client via WebSocket for minimum latency, and persists them to the database asynchronously in the background. The app distinguishes **interim** segments from **final** segments and replaces them in-place when the final version arrives — no layout shift.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Worker as OPFS Worker
  participant Core
  participant DB
  participant GCS
  participant ML
  participant AAI as AssemblyAI

  loop Every ~100ms audio chunk
      App->>App: Assign sequence number (monotonic)
      App-->>Worker: Chunk + seq (async postMessage)
      Note over Worker: Write to OPFS shadow buffer (always-on)
      App->>Core: WS: Binary audio frame (chunk + seq)
      par Store as GCS chunk
          Core->>GCS: PUT meetings/{id}/chunks/{seq:08d}.webm
      and Forward to ML
          Core->>ML: Audio chunk (via ML WebSocket binary frame)
          ML->>AAI: Forward audio chunk
      end
  end

  Note over AAI: Transcription produced

  AAI-->>ML: Transcript segment (interim or final)
  ML-->>Core: WS: TranscriptSegmentProducedEvent

  par Lowest latency: stream to client
      Core-->>App: WS: TranscriptSegmentEvent
      Note over App: Render segment immediately
  and Persist asynchronously
      Core->>Core: Enqueue async DB write
      Core->>DB: INSERT transcript_segment (background)
  end`"
/>

### Flow 3: Live Insights Pipeline [#flow-3-live-insights-pipeline]

Talking points and tasks are extracted by the same LLM query, batched together as a single structured output call. This avoids redundant token spending — the transcript context is loaded once into the prompt cache and both extraction tasks run against it. All insights stream back through the ML WebSocket as CloudEvents text frames, following the dual-write pattern: Core fans out to the browser via WebSocket for latency, and persists to DB asynchronously for durability.

**Context management:** ML maintains a rolling transcript buffer in memory, appending each finalised segment as it arrives. The full buffer is included in the LLM prompt as cached context. The prompt is always ordered as: `[system instructions] [schema] [anchor segments] [recent window] [latest segment]`. When the context budget is exceeded, segments are dropped from the **oldest non-anchored position** — the boundary between the anchor and the recent window — never from the front. Removing from the front would change the content immediately after the static cached prefix, invalidating every transcript token in the cache. Because the anchor only ever grows, the cached region (`[system] + [schema] + [anchor]`) expands over the course of the meeting and is never invalidated by trimming.
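
The trimming rule can be sketched as follows. This is a simplified model, not ML's actual implementation: `TranscriptContext` is a hypothetical name, tokens are approximated by whitespace-separated words, and the policy that promotes segments into the anchor is not described here, so it is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptContext:
    anchor: list[str] = field(default_factory=list)   # stable, cached region
    recent: list[str] = field(default_factory=list)   # sliding recent window

    def token_count(self) -> int:
        # Stand-in for a real tokenizer: one token per whitespace-separated word.
        return sum(len(s.split()) for s in self.anchor + self.recent)

    def append(self, segment: str, budget: int) -> list[str]:
        """Add a finalised segment, then trim. Returns the dropped segments."""
        self.recent.append(segment)
        dropped = []
        # Drop from the oldest non-anchored position. The anchor is never
        # touched, and the segment that was just appended is always kept.
        while self.token_count() > budget and len(self.recent) > 1:
            dropped.append(self.recent.pop(0))
        return dropped


ctx = TranscriptContext(anchor=["intro one two", "agenda three four"])
ctx.append("a b c", budget=12)
ctx.append("d e f", budget=12)
dropped = ctx.append("g h i", budget=12)   # over budget: oldest recent segment goes
```

Because only `recent` is ever trimmed, everything up to the end of `anchor` stays byte-identical between calls, which is the property the cached prefix depends on.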

#### Flow 3a: Live Talking Points & Tasks (Batched — Per Finalised Segment) [#flow-3a-live-talking-points--tasks-batched--per-finalised-segment]

On every finalised transcript segment, ML sends the full rolling transcript buffer to the LLM requesting both the latest talking point and any new tasks in a single structured output call. The LLM returns both in one response. Talking points update immediately, and tasks are extracted opportunistically from the same call.

**LLM-native task deduplication:** The current list of extracted tasks is appended to the dynamic suffix of each prompt. The LLM is instructed to return only tasks that are genuinely new — not already represented in the existing list. This delegates deduplication to the model, which handles paraphrase and semantic overlap naturally without a separate post-processing step.

The prompt is structured for OpenAI prompt caching: `[system instructions] [output schema] [anchor segments]` forms the stable cached prefix that grows as the session progresses. `[recent window] [latest segment] [existing task list]` is the dynamic suffix appended on each call. Placing the task list in the dynamic suffix (not the cached prefix) keeps the cache hit rate high — the stable cached region is never invalidated by task accumulation.
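
A minimal sketch of that split, assuming plain-text prompt parts joined by newlines. `build_prompt` and the `Existing tasks:` label are illustrative choices, not the real prompt format.

```python
import json


def build_prompt(
    system: str,
    schema: str,
    anchor: list[str],
    recent: list[str],
    latest: str,
    existing_tasks: list[str],
) -> tuple[str, str]:
    """Return (cached_prefix, dynamic_suffix) in the documented order."""
    prefix = "\n".join([system, schema, *anchor])
    suffix = "\n".join(
        [*recent, latest, "Existing tasks: " + json.dumps(existing_tasks)]
    )
    return prefix, suffix


# The task list changes between calls, but the prefix does not.
p1, _ = build_prompt("SYS", "SCHEMA", ["seg1"], ["seg7"], "seg8", [])
p2, _ = build_prompt("SYS", "SCHEMA", ["seg1"], ["seg7"], "seg8", ["Send the report"])
assert p1 == p2
```

Keeping every frequently changing input in the suffix means a new task never alters the bytes that the cache is keyed on.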

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant ML
  participant LLM as OpenAI

  Note over ML: Final segment received
  ML->>ML: Append segment to rolling transcript buffer

  ML->>LLM: Structured output query (cached prefix + transcript buffer + existing tasks)
  LLM-->>ML: Structured response — talking point and new tasks only

  opt Talking point returned
      ML-->>Core: WS: TalkingPointProducedEvent (draft)
      par Stream to client
          Core-->>App: WS: EntityChanged (talking_point)
          Note over App: SWR revalidates, shows draft talking point
      and Persist asynchronously
          Core->>Core: Enqueue async DB write
          Core->>DB: UPSERT talking_point (draft)
      end
  end

  opt New tasks detected
      loop For each new task
          ML-->>Core: WS: TaskProducedEvent (system-generated)
          par Stream to client
              Core-->>App: WS: EntityChanged (task)
              Note over App: SWR revalidates, shows new task
          and Persist asynchronously
              Core->>Core: Enqueue async DB write
              Core->>DB: INSERT task (background)
          end
      end
  end`"
/>

#### Flow 3b: Live Speaker Identification (Per Diarised Speaker) [#flow-3b-live-speaker-identification-per-diarised-speaker]

AssemblyAI's transcript segments arrive pre-diarised — each segment carries a `speaker_label` (e.g. `speaker_1`, `speaker_2`). ML's job is to resolve each `speaker_label` to a known Person by matching voice embeddings against enrolled profiles.

**Every segment gets a voice embedding.** Regardless of whether the speaker has been identified, ML extracts a voice embedding from the segment's audio and stores it on the segment. This happens unconditionally — embeddings are required for post-meeting RAG and future retrieval, not only for speaker matching.

**Matching strategy:** Speaker matching runs separately, gated on the per-session map `speaker_label → { status, person_id?, attempts }`. This map lives in ML's memory for the hot path but is mirrored to a `meeting_speaker_states` table in Postgres on meaningful transitions. On session start — and on reconnect after a pod restart — Core pushes the current speaker states and voice profiles to ML via `StreamStartEvent`, so ML reconstructs its in-memory map without needing a pull endpoint.

| State       | Behaviour                                                                                                                                                                                                                   | Persisted?                                                                                                   |
| ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| `unmatched` | Compare this segment's embedding against all enrolled voice profiles. If confidence exceeds the match threshold → transition to `matched`. Otherwise, increment `attempts` and retry on the next segment from this speaker. | Attempts tracked in-memory only — an unmatched speaker restarting at 0 on recovery is acceptable.            |
| `matched`   | The speaker label is locked to a person. All future segments from this speaker are tagged immediately — no further voice comparison needed.                                                                                 | Yes — persisted to `meeting_speaker_states` (status + person reference) on transition.                       |
| `exhausted` | After N failed attempts (configurable, e.g. 5 segments), stop comparing for this speaker. The raw `speaker_label` is preserved. The user can manually resolve it via Flow 7 (speaker labelling).                            | Yes — persisted to `meeting_speaker_states` on transition.                                                   |
| `manual`    | Set by Flow 7 when the user labels a speaker. Takes precedence over voice matching — ML will not attempt to match this speaker regardless of voice similarity.                                                              | Yes — written synchronously by Core (Flows 7/8) so it is immediately visible on any subsequent pod recovery. |
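
The state table can be read as a small state machine. The sketch below is one possible encoding under stated assumptions: the threshold value, `MAX_ATTEMPTS = 5`, and all function names are hypothetical, and persistence to `meeting_speaker_states` is left out.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 5        # "N" in the table; configurable in practice
MATCH_THRESHOLD = 0.8   # hypothetical confidence threshold


@dataclass
class SpeakerState:
    status: str = "unmatched"   # unmatched | matched | exhausted | manual
    person_id: Optional[str] = None
    attempts: int = 0


def on_segment(state: SpeakerState, best_person: Optional[str], score: float) -> SpeakerState:
    """Advance one speaker's state after a segment. Only `unmatched` compares voices."""
    if state.status != "unmatched":
        return state  # matched, exhausted and manual skip voice comparison
    if best_person is not None and score > MATCH_THRESHOLD:
        return SpeakerState("matched", best_person, state.attempts)
    attempts = state.attempts + 1
    if attempts >= MAX_ATTEMPTS:
        return SpeakerState("exhausted", None, attempts)
    return SpeakerState("unmatched", None, attempts)


def label_manually(person_id: str) -> SpeakerState:
    """Flow 7 override: replaces whatever state the speaker was in."""
    return SpeakerState("manual", person_id)
```

In this reading, only the transitions into `matched`, `exhausted`, and `manual` would need to be mirrored to Postgres, which matches the "Persisted?" column.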

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant ML
  participant Core
  participant DB

  Note over ML: Segment received (speaker_label: speaker_1)

  ML->>ML: Extract voice embedding from segment audio
  ML-->>Core: WS: SegmentFeaturesProducedEvent (segment reference + embedding)
  Core->>DB: UPDATE transcript_segment (voice embedding)

  alt speaker_1 already matched
      ML-->>Core: WS: TranscriptSegmentProducedEvent (person resolved)
      Core->>DB: UPDATE transcript_segment (person)
      Core-->>Core: WS: EntityChanged (transcript_segment)
  else speaker_1 unmatched (attempts < max)
      ML->>ML: Compare embedding against in-session voice profiles (pushed by Core)

      alt Confident match (score > threshold)
          ML->>ML: Lock speaker_1 → person (session cache)
          ML-->>Core: WS: SpeakerMatchProducedEvent (speaker label resolved to person)
          Core->>DB: UPDATE transcript_segments (person, by speaker label)
          Core->>DB: UPSERT meeting_speaker_states (matched, async)
          Core-->>Core: WS: EntityChanged (transcript_segment)
          Note over Core: All past + future segments for speaker_1 show person name
      else No confident match
          ML->>ML: Increment attempts for speaker_1
          Note over ML: Keep raw speaker_label, retry on next segment
      end
  else speaker_1 exhausted (attempts >= max)
      ML-->>Core: WS: SpeakerExhaustedEvent
      Core->>DB: UPSERT meeting_speaker_states (exhausted, async)
      Note over ML: Skip comparison — speaker unknown, raw label preserved
  end`"
/>

***

## Part 2: User Mutations [#part-2-user-mutations]

Flows initiated by the user during or after a recording session. All follow the **Optimistic Mutation with Echo-Suppressed Streaming** pattern: the client updates local state immediately, sends the mutation via REST, and suppresses the returning WebSocket echo.
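
The echo check itself is small. As a rough sketch, assuming each `EntityChanged` event carries the `sourceClientId` shown in the diagrams below and is represented as a dictionary (the function name and event shape are assumptions):

```python
def should_revalidate(event: dict, local_client_id: str) -> bool:
    """True when a WebSocket event came from another client and should refresh local state."""
    return event.get("sourceClientId") != local_client_id


own_echo = {"type": "EntityChanged", "entity": "task", "sourceClientId": "tab-1"}
from_other = {"type": "EntityChanged", "entity": "task", "sourceClientId": "tab-2"}

assert should_revalidate(own_echo, "tab-1") is False   # suppressed: we already applied it
assert should_revalidate(from_other, "tab-1") is True  # another tab or ML changed it
```

Events without a `sourceClientId`, such as system-generated changes, pass the check and are applied normally.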

### Flow 4: Notes Auto-Save [#flow-4-notes-auto-save]

The **Private Notes** scratchpad is the primary surface of the live recording view — it occupies the left column. Notes auto-save continuously with no explicit save button. The app debounces keystrokes and patches the meeting's notes field.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  Note over App: User types in notes scratchpad
  App->>App: Debounce (500ms since last keystroke)
  App->>Core: Save notes (echo-suppressed)
  Core->>DB: UPDATE meeting (notes)
  Core-->>App: 200
  Core-->>App: WS: EntityChanged (meeting)
  Note over App: Echo suppressed (sourceClientId matches)<br/>Notes remain editable post-session`"
/>

### Flow 5: User Creates Task [#flow-5-user-creates-task]

The user's task is written via REST (not the streaming path) since it's a user-initiated mutation. Tasks have a description (required), assignee (optional), and due date (optional).

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  App->>App: Optimistic: add task to local state
  App->>Core: Create task (idempotent, echo-suppressed)
  Core->>DB: INSERT task
  Core-->>App: 201 Task
  Note over App: Replace optimistic with server entity
  Core-->>App: WS: EntityChanged (task)
  Note over App: Echo suppressed (sourceClientId matches)`"
/>

### Flow 6: Task Mutations (Full CRUD) [#flow-6-task-mutations-full-crud]

Flow 5 covers task creation. The UI design specifies a full set of task mutations: edit, delete, toggle completion, nest under other tasks, assign a person, and set a due date. Editing a system-generated task converts it to user-owned.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  Note over App: === Update Task (edit, assign, due date, toggle completion) ===
  App->>App: Optimistic: update task in local state
  App->>Core: Update task (echo-suppressed)
  Note over Core: System-generated task edited by user → promoted to user-owned
  Core->>DB: UPDATE task
  Core-->>App: 200 Task
  Core-->>App: WS: EntityChanged (task)
  Note over App: Echo suppressed

  Note over App: === Delete Task ===
  App->>App: Optimistic: remove task from local state
  App->>Core: Delete task (echo-suppressed)
  Core->>DB: DELETE task (cascades to sub-tasks)
  Core-->>App: 204
  Core-->>App: WS: EntityChanged (task, deleted)
  Note over App: Echo suppressed

  Note over App: === Create Sub-Task (nesting) ===
  App->>App: Optimistic: add sub-task under parent
  App->>Core: Create sub-task (idempotent, echo-suppressed)
  Core->>DB: INSERT sub_task
  Core-->>App: 201 SubTask
  Core-->>App: WS: EntityChanged (task)
  Note over App: Echo suppressed`"
/>

### Flow 7: User Labels Speaker as Person [#flow-7-user-labels-speaker-as-person]

When a user identifies "Speaker A" as a known Person (by clicking the speaker label on any transcript segment), the system reassigns all segments from that speaker and records the mapping in `meeting_speaker_states` as a manual override so that ML respects it immediately on any pod recovery. Voice profile enrichment from the session's embeddings happens during post-meeting processing, not here.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  App->>Core: Assign speaker to person (meeting-scoped)
  Core->>DB: UPDATE transcript_segments (person, by speaker label)
  Core->>DB: UPSERT meeting_speaker_states (manual, with person reference)
  Core-->>App: 200
  Core-->>App: WS: EntityChanged (transcript_segment)
  Note over App: All segments now show person name`"
/>

### Flow 8: Create New Person During Speaker Labelling [#flow-8-create-new-person-during-speaker-labelling]

Flow 7 assumes the person already exists. The UI design says the user can "reassign to a known person or **add a new one**." When creating a new person, the UI handles this as two sequential operations: first create the person, then assign them to the speaker label using the same endpoint as Flow 7. The speaker-labels endpoint always receives an existing person reference — it has no knowledge of whether that person was just created or long-established. Voice profile enrichment happens during post-meeting processing.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  Note over App: User clicks speaker label → selects &#x22;Add new person&#x22;

  Note over App: Step 1 — Create the person
  App->>Core: Create person
  Core->>DB: INSERT person
  Core-->>App: 201 Person
  Core-->>App: WS: EntityChanged (person, created)

  Note over App: Step 2 — Assign to speaker label (same as Flow 7)
  App->>Core: Assign speaker to person (same as Flow 7)
  Core->>DB: UPDATE transcript_segments (person, by speaker label)
  Core->>DB: UPSERT meeting_speaker_states (manual, with person reference)
  Core-->>App: 200
  Core-->>App: WS: EntityChanged (transcript_segment)
  Note over App: All segments show new person name`"
/>

***

## Part 3: Session End & Post-Meeting [#part-3-session-end--post-meeting]

Flows triggered when a recording stops (user-initiated or auto) and the subsequent background processing that upgrades all artefacts to final quality.

### Flow 9: Stop Recording [#flow-9-stop-recording]

The user presses **Stop Recording** (or the system auto-stops at the duration limit). The stop sequence is strictly ordered: drain ML first (to flush final transcript segments), then collect any remaining OPFS gaps, then compose the final audio file, then trigger post-meeting processing. This ordering ensures no audio is lost and the composed file includes all chunks.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant GCS
  participant ML
  participant AAI as AssemblyAI

  App->>Core: WS: StopRecordingCommand

  Note over Core: Phase 1 — Drain ML
  Core->>ML: WS: DrainCommand
  ML->>AAI: Close transcription session (drain final segments)
  AAI-->>ML: Final transcript segments
  ML->>Core: WS: Final TranscriptSegmentProducedEvents
  ML->>Core: WS: StreamDrainedEvent
  Core->>DB: INSERT final segments
  Core-->>App: WS: EntityChanged (transcript_segment)

  Note over Core: Phase 2 — Notify client and collect OPFS gaps
  Core-->>App: WS: RecordingStoppedEvent (last_received_seq)
  Note over App: UI transitions to Meeting Summary (Overview tab)
  Note over App: Background: compare last_received_seq with OPFS buffer

  opt OPFS contains chunks not yet in GCS
      App->>Core: Upload gap chunks from OPFS (batched, resumable)
      Core->>GCS: Store gap chunks (by sequence number)
      Core-->>App: 200 (all gaps received)
  end

  Note over Core: Phase 3 — Compose final audio
  Core->>GCS: GCS Compose chunks/{seq:08d}.webm → audio.webm
  Note over Core: Hierarchical compose — groups of ≤32 per GCS operation

  Note over Core: Phase 4 — Trigger post-meeting processing
  Core->>Core: Publish TranscriptionJob via outbox (skip tasks — live)
  Note over Core: Same job as file upload, but tasks are skipped`"
/>
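
The hierarchical compose in Phase 3 can be planned ahead of time. The sketch below only computes the list of compose operations and does not call Cloud Storage; the function name and the `tmp/compose-...` intermediate object names are hypothetical.

```python
MAX_SOURCES = 32  # a single compose operation accepts at most 32 source objects


def plan_compose(sources: list[str], final_name: str) -> list[tuple[list[str], str]]:
    """Return ordered (source_objects, destination) compose operations."""
    if not sources:
        raise ValueError("nothing to compose")
    ops = []
    level = 0
    current = list(sources)
    # Collapse groups of up to 32 objects into intermediates until one call suffices.
    while len(current) > MAX_SOURCES:
        next_level = []
        for i in range(0, len(current), MAX_SOURCES):
            dest = f"tmp/compose-{level}-{i // MAX_SOURCES:06d}"
            ops.append((current[i:i + MAX_SOURCES], dest))
            next_level.append(dest)
        current = next_level
        level += 1
    ops.append((current, final_name))
    return ops


chunks = [f"chunks/{i:08d}.webm" for i in range(100)]
ops = plan_compose(chunks, "audio.webm")
print([len(src) for src, _ in ops])  # [32, 32, 32, 4, 4]
```

Grouping preserves input order at every level, so the final object is the byte-wise concatenation of the chunks in sequence order.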

### Flow 10: Duration Warning & Auto-Stop [#flow-10-duration-warning--auto-stop]

A configurable maximum recording duration (default: 4 hours) is enforced server-side. Core sends a warning at T-10 minutes and auto-stops at the limit. The auto-stop triggers the same post-meeting pipeline as a user-initiated stop.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant Core
  participant App
  participant GCS
  participant PubSub

  Note over Core: Recording timer reaches (max - 10 min)
  Core-->>App: WS: RecordingDurationWarningEvent (remaining time)
  Note over App: Non-blocking banner: &#x22;Recording will automatically stop in 10 minutes.&#x22;

  Note over Core: Recording timer reaches max duration
  Note over Core: Same 4-phase stop sequence as Flow 9
  Core->>ML: WS: DrainCommand
  Note over Core: Phase 1: Drain → Phase 2: OPFS gaps → Phase 3: Compose → Phase 4: TranscriptionJob
  Core-->>App: WS: RecordingStoppedEvent (duration limit, last_received_seq)
  Note over App: UI transitions to Meeting Summary (Overview tab)`"
/>

### Flow 11: Post-Meeting Processing (Automatic, via Pub/Sub) [#flow-11-post-meeting-processing-automatic-via-pubsub]

Post-meeting processing runs automatically via the shared `TranscriptionJob` Pub/Sub worker. For live recordings, the job is published with task extraction flagged as skipped, so the tasks captured during the session are preserved.

The worker:

1. Batch-transcribes the full audio from GCS (higher accuracy)
2. Replaces transcript segments with the improved results
3. Generates headline, summary, topics, and finalises talking points
4. Extracts tasks (file upload flow only)

The Meeting Summary page shows a subtle progress indicator during re-processing and updates each artefact in real time as it completes.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant PubSub
  participant ML
  participant GCS
  participant AAI as AssemblyAI
  participant LLM as OpenAI
  participant Core
  participant DB
  participant App

  PubSub-->>ML: Consume TranscriptionJob

  Note over ML: audio.webm assembled from chunk sequence in Flow 9
  ML->>GCS: Download audio.webm (fully assembled)
  ML->>AAI: Batch transcribe (offline, higher accuracy)
  AAI-->>ML: Full transcript (final segments with speaker labels)

  ML->>Core: Replace all transcript segments
  Core->>DB: Replace transcript segments
  Core-->>App: WS: EntityChanged (transcript_segment)

  ML->>LLM: Generate headline
  ML->>Core: Set meeting headline
  Core->>DB: UPDATE meeting (headline)
  Core-->>App: WS: EntityChanged (meeting)

  ML->>LLM: Generate summary + topics + finalise talking points
  ML->>Core: Set synthesis artefacts (summary, topics, talking points)
  Core->>DB: UPSERT synthesis
  Core-->>App: WS: EntityChanged (meeting)

  alt Task extraction enabled (file upload flow)
      ML->>LLM: Extract tasks
      ML->>Core: Create system-generated tasks
      Core->>DB: INSERT tasks
      Core-->>App: WS: EntityChanged (task, created)
  else Task extraction skipped (live recording flow)
      Note over ML: Tasks preserved from live session
  end

  Note over App: All insights now final quality`"
/>

### Flow 12: Transcription Processing Lifecycle [#flow-12-transcription-processing-lifecycle]

The `transcriptions` table tracks processing status through a defined state machine: `pending` → `transcribing` → `synthesizing` → `completed` (or `failed`). The client uses this to show re-processing progress on the Meeting Summary page.
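
One way to enforce this lifecycle is an explicit transition table. The sketch below assumes that `failed` is reachable from any non-terminal state, which the text implies but does not spell out; the names are illustrative.

```python
ALLOWED_TRANSITIONS = {
    "pending": {"transcribing", "failed"},
    "transcribing": {"synthesizing", "failed"},
    "synthesizing": {"completed", "failed"},
    "completed": set(),   # terminal
    "failed": set(),      # terminal
}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not part of the state machine."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


status = "pending"
for step in ("transcribing", "synthesizing", "completed"):
    status = transition(status, step)
```

Rejecting out-of-order updates keeps a late or duplicated status message from moving a completed transcription backwards.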

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant ML
  participant Core
  participant DB
  participant App

  Note over ML: Post-meeting job begins

  ML->>Core: Update transcription status → transcribing
  Core->>DB: UPDATE transcription (status)
  Core->>DB: INSERT transcription_status_history
  Core-->>App: WS: EntityChanged (transcription)
  Note over App: Progress indicator: &#x22;Re-processing transcript…&#x22;

  Note over ML: Synthesis stage (headline, summary, topics, talking points)
  ML->>Core: Update transcription status → synthesizing
  Core->>DB: UPDATE transcription (status)
  Core-->>App: WS: EntityChanged (transcription)

  ML->>Core: Update transcription status → completed
  Core->>DB: UPDATE transcription (status)
  Core-->>App: WS: EntityChanged (transcription)
  Note over App: Progress indicator removed. All artefacts final.`"
/>

### Flow 13: Audio Playback (Signed URL Direct to GCS) [#flow-13-audio-playback-signed-url-direct-to-gcs]

Core generates a short-lived signed URL. The client streams audio directly from Cloud Storage, with standard HTTP range requests for seeking. The audio player appears on the **Transcript tab** of the Meeting Summary page. If the audio file is still being processed, the endpoint returns `404` and the client retries with exponential backoff.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant GCS

  App->>Core: Request audio URL
  alt Audio not yet available (still processing)
      Core-->>App: 404
      Note over App: Player disabled: &#x22;Audio is still being processed.&#x22;
      App->>App: Retry with exponential backoff (5s, 10s, 20s…)
  else Audio ready
      Core->>GCS: Generate signed URL (expiry: 1 hour)
      Core-->>App: 200 (signed URL + expiry)
      Note over App: Player enabled — full controls active
  end

  App->>GCS: Stream audio (Range header, signed URL)
  GCS-->>App: 206 Partial Content
  Note over App: HTML5 Audio element plays with seeking

  App->>App: On timeupdate: highlight segment matching current playback position
  App->>App: On segment click: seek audio to segment start

  Note over App: === Signed URL expiry handling ===
  App->>App: Timer fires before signed URL expires
  App->>Core: Request audio URL
  Core-->>App: 200 (fresh signed URL)
  Note over App: Seamless URL rotation — no playback interruption`"
/>
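
The retry schedule in the diagram (5s, 10s, 20s…) is a plain doubling backoff. A minimal sketch follows; the 60-second cap is an assumption added for illustration, since no upper bound is specified here.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(base: float = 5.0, factor: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield retry delays in seconds: base, base*factor, ... limited to cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


print(list(islice(backoff_delays(), 5)))  # [5.0, 10.0, 20.0, 40.0, 60.0]
```

A real client would typically also add random jitter so that many tabs do not retry in lockstep.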

### Flow 14: Degraded Mode — Layered Resilience [#flow-14-degraded-mode--layered-resilience]

The system has five independent failure domains. Each degrades gracefully: the OPFS shadow buffer ensures audio is never lost regardless of what fails on the backend. Except for an App → Core WebSocket drop, which the client detects itself via `onclose`, Core detects failures and notifies the client via `RecordingErrorEvent` with a specific error code. Recovery is automatic: when the broken link is restored, Core sends a recovery signal and the client clears the warning. Gaps in GCS are filled via the gap upload sequence (Flow 16).

| Failure                    | What breaks                                      | What still works                                                  | Error code            |
| -------------------------- | ------------------------------------------------ | ----------------------------------------------------------------- | --------------------- |
| App → Core (WS drops)      | All commands, events, audio streaming to Core    | OPFS shadow buffer captures all audio locally                     | Client-side `onclose` |
| Core → GCS (storage fails) | Chunk writes — audio gap accumulates in GCS      | Audio still flows via WS to Core; OPFS captures all audio locally | `storage_unavailable` |
| Core → ML (stream fails)   | Transcription, talking points, tasks, speaker ID | Audio→GCS (or OPFS on WS drop), notes auto-save                   | `ml_unavailable`      |
| ML → AssemblyAI            | Transcript segments                              | Audio→GCS, notes, voice embeddings                                | `transcoder_error`    |
| ML → OpenAI                | Talking points, task extraction, summaries       | Transcript, speaker ID, audio→GCS                                 | `insight_warning`     |

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant GCS
  participant ML

  Note over Core: ML WebSocket disconnects (timeout/error)
  Core->>Core: Switch to GCS-only mode (audio still persisted)
  Core-->>App: WS: RecordingErrorEvent (ml_unavailable)
  Note over App: Banner: &#x22;Live insights paused — audio is still recording.&#x22;<br/>Context panel freezes on last-received content

  loop Audio continues
      App->>Core: WS: Binary audio frame (with seq number)
      Core->>GCS: Store chunk (by sequence number)
      Note over Core: Audio chunks NOT forwarded to ML
  end

  Note over Core: ML connection restored
  Core->>ML: Reconnect ML WebSocket (StreamStartEvent with speaker states + voice profiles)
  ML->>ML: Reconstruct speaker_label → state map from pushed data
  ML->>Core: GET /transcriptions/{id}/segments?after_ms=... (rebuild LLM context)
  Core-->>ML: Recent transcript segments
  ML-->>Core: WS: StreamReadyEvent
  Core-->>App: WS: RecordingErrorEvent (ml_recovered)
  Note over App: Banner clears. Live insights resume.
  Note over Core: Resume forwarding audio to ML via WebSocket`"
/>

The diagram above shows the `ml_unavailable` path in detail. The remaining failure domains follow the same notification pattern:

**App → Core WS drop:** The browser's WebSocket `onclose` event fires. The OPFS shadow buffer captures all audio produced during the outage. On reconnect, the client sends `ResumeRecordingCommand` and Flow 16 backfills any missing GCS chunks before audio forwarding to ML resumes.

**Core → GCS failure:** Chunk writes fail — Core sends `RecordingErrorEvent (storage_unavailable)`. Audio continues flowing through the WebSocket; the OPFS shadow buffer captures the gap locally. On GCS recovery, Core sends `RecordingErrorEvent (storage_recovered, last_stored_seq)` and gap chunks are uploaded via Flow 16.

**ML → AssemblyAI failure:** Transcript segments stop arriving. Core sends `RecordingErrorEvent (transcoder_error)`. Voice embeddings are unaffected. Audio and notes continue. Missing transcript is rebuilt during post-meeting processing from the full audio in GCS.

**ML → OpenAI failure:** Talking points and task extraction stop. Core sends `RecordingErrorEvent (insight_warning)`. Transcript and speaker ID are unaffected. Missing insights are rebuilt during post-meeting processing.

### Flow 15: Audio Silence Detection [#flow-15-audio-silence-detection]

Two layers detect audio problems: the **browser** catches microphone issues locally, and **Core** catches broken streams server-side.

**Client-side (primary):** The browser monitors the `MediaStream` via Web Audio API `AnalyserNode`. If the RMS level falls below a threshold for 10 consecutive seconds, the client shows an inline notice. No server round-trip is needed: this is purely a UX signal. The notice clears automatically when audio levels recover or the first transcript segment arrives.

**Server-side (secondary):** Core tracks time since the last audio chunk was received on the WebSocket. If no chunks arrive for 10 seconds while a session is active, Core sends `RecordingErrorEvent (no_audio_detected)`. This catches the case where the browser believes it's sending audio but the WebSocket stream is silently broken.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core

  Note over App: === Client-side detection ===
  App->>App: AnalyserNode monitors MediaStream RMS
  App->>App: RMS below threshold for 10s
  Note over App: Inline notice: &#x22;We're not detecting audio.<br/>Check your microphone.&#x22;

  alt Audio levels recover
      App->>App: RMS above threshold
      Note over App: Notice clears automatically
  else First transcript segment arrives
      Note over App: Notice clears (system is hearing audio)
  end

  Note over App: === Server-side detection ===
  Note over Core: No audio chunks received for 10s<br/>while session is active
  Core-->>App: WS: RecordingErrorEvent (no_audio_detected)
  Note over App: Inline notice (if not already showing)`"
/>
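
The client-side rule (RMS below a threshold for 10 consecutive seconds) is sketched below in Python for illustration only; the real check runs in the browser against `AnalyserNode` data. The threshold value `0.01` and the class name are assumptions.

```python
import math
from typing import Optional


def rms(samples: list[float]) -> float:
    """Root mean square of one frame of samples in the range -1.0 to 1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceDetector:
    def __init__(self, threshold: float = 0.01, window_s: float = 10.0) -> None:
        self.threshold = threshold
        self.window_s = window_s
        self.quiet_since: Optional[float] = None

    def update(self, samples: list[float], now: float) -> bool:
        """Feed one frame with its timestamp. Returns True while the notice should show."""
        if rms(samples) >= self.threshold:
            self.quiet_since = None   # audio recovered: reset the window
            return False
        if self.quiet_since is None:
            self.quiet_since = now    # first quiet frame starts the window
        return now - self.quiet_since >= self.window_s
```

Any frame above the threshold resets the window, so the notice clears as soon as audio levels recover.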

### Flow 16: Audio Gap Recovery [#flow-16-audio-gap-recovery]

When a connectivity gap occurs — either the App→Core WebSocket drops or Core cannot write to GCS — the OPFS shadow buffer accumulates all audio produced during the outage. On recovery, the client compares its OPFS buffer against the last sequence number Core successfully stored in GCS, then uploads any missing chunks via REST. Core writes each to GCS by sequence number and deduplicates — chunks already stored are skipped. This same flow runs at session stop time if any gaps remain (see Flow 9).

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Worker as OPFS Worker
  participant Core
  participant GCS

  Note over App,Core: WS reconnect or GCS recovery

  App->>Core: WS: ResumeRecordingCommand (last_client_seq)
  Core-->>App: WS: RecordingResumedEvent (last_stored_seq)

  Note over App: Compare last_stored_seq with OPFS buffer

  opt Gaps exist (last_stored_seq < last_client_seq)
      App->>Worker: Request buffered chunks (seq > last_stored_seq)
      Worker-->>App: Audio chunks from OPFS shadow buffer

      loop For each gap chunk
          App->>Core: Upload gap chunk (with seq number)
          Core->>GCS: Store chunk (keyed by sequence number)
          Core-->>App: 200
      end
  end

  Core-->>App: WS: GapUploadComplete (last_stored_seq updated)
  Note over App: Banner: &#x22;Reconnected. Live transcript continues from here.<br/>The full transcript (including the gap) will regenerate when the meeting ends.&#x22;
  Note over Core: Resume forwarding audio to ML`"
/>
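
The comparison and deduplication steps in this flow can be sketched as follows. This is an illustration under stated assumptions, not Core's implementation: `chunks_to_upload` and `ChunkStore` are hypothetical names, and the in-memory dictionary stands in for GCS. The object naming follows the `meetings/{id}/chunks/{seq:08d}.webm` pattern listed under Design Decisions.

```python
from typing import Dict, List


def chunks_to_upload(opfs_seqs: List[int], last_stored_seq: int) -> List[int]:
    """Sequence numbers in the OPFS buffer that Core has not yet stored."""
    return sorted(s for s in opfs_seqs if s > last_stored_seq)


class ChunkStore:
    """In-memory stand-in for Core writing chunks to GCS by sequence number."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    @staticmethod
    def object_name(meeting_id: str, seq: int) -> str:
        # Zero-padded so object names sort in sequence order.
        return f"meetings/{meeting_id}/chunks/{seq:08d}.webm"

    def store(self, meeting_id: str, seq: int, data: bytes) -> bool:
        """Returns True if the chunk was written, False if it was already stored."""
        name = self.object_name(meeting_id, seq)
        if name in self.objects:
            return False  # duplicate upload: skip
        self.objects[name] = data
        return True
```

Because the store is keyed by sequence number, re-uploading a chunk after an ambiguous failure is harmless, which is what makes the retry path safe.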

### Flow 17: Server-Side Inactivity Timeout — **New** [#flow-17-server-side-inactivity-timeout--new]

If no audio chunks arrive for a configurable period (default: 5 minutes) while a recording session is active, Core treats the session as abandoned and triggers the same stop sequence as Flow 9. This covers the case where the user closes their laptop lid, loses power, or otherwise disappears without explicitly stopping. The WebSocket heartbeat timeout (\~60 seconds) transitions the connection to closed, but without this secondary timeout the recording resource would remain in `active` state indefinitely. The inactivity timeout therefore prevents abandoned sessions from blocking the concurrent-session guard.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant Core
  participant ML
  participant GCS

  Note over Core: No audio chunks received for 5 minutes<br/>while session is active (heartbeat already failed)
  Core->>Core: Mark session as abandoned
  Core->>ML: WS: DrainCommand (reason: connection_closed)
  ML-->>Core: WS: StreamDrainedEvent
  Core->>GCS: GCS Compose (whatever chunks exist)
  Core->>Core: Publish TranscriptionJob via outbox
  Note over Core: Post-meeting processing runs on<br/>whatever audio reached GCS`"
/>
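
The relationship between the two timeouts can be sketched as a single decision function. The names below (`SessionAction`, `next_action`) are hypothetical; only the \~60-second heartbeat timeout and the 5-minute default come from the text above.

```python
from enum import Enum


class SessionAction(Enum):
    NONE = "none"
    CLOSE_CONNECTION = "close_connection"  # heartbeat failed, session still active
    ABANDON_SESSION = "abandon_session"    # run the Flow 9 stop sequence


HEARTBEAT_TIMEOUT_S = 60.0        # approximate value from the text
INACTIVITY_TIMEOUT_S = 5 * 60.0   # configurable; 5 minutes is the default


def next_action(
    seconds_since_last_pong: float,
    seconds_since_last_chunk: float,
    session_active: bool,
) -> SessionAction:
    """Decide what Core should do for one session on a periodic sweep."""
    if not session_active:
        return SessionAction.NONE
    if seconds_since_last_chunk >= INACTIVITY_TIMEOUT_S:
        return SessionAction.ABANDON_SESSION
    if seconds_since_last_pong >= HEARTBEAT_TIMEOUT_S:
        return SessionAction.CLOSE_CONNECTION
    return SessionAction.NONE
```

The heartbeat check only closes the connection; it is the longer inactivity check that releases the session.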

### Flow 18: Background Tab Audio Continuity — **New** [#flow-18-background-tab-audio-continuity--new]

Chrome and other browsers aggressively throttle background tabs: JavaScript timers on the main thread can be limited to as little as one execution per minute, and some WebSocket activity may be delayed. However, `MediaRecorder` itself runs on a browser-internal thread and is **not** throttled when the tab is backgrounded. The critical design choice is that all audio chunk processing (sequence numbering, OPFS writes, and WebSocket sends) runs in a dedicated **Web Worker**, which is exempt from background tab throttling. The main thread only receives notifications for UI updates.

This means audio capture and transmission continue uninterrupted when the user switches to another tab. The page title changes to "● Recording…" so the user can find the tab.

### Flow 19: Batched Gap Upload — **New** [#flow-19-batched-gap-upload--new]

When a large connectivity gap occurs (e.g., 30 minutes offline at \~10 chunks/sec is \~18,000 chunks), the client uploads gap chunks in batches rather than one at a time. The client reads chunks from OPFS in batches of 50, uploads each batch as a single multipart request to `POST /meetings/{id}/recording/chunks`, and uses the `remaining_missing_sequences` in the response to drive the next batch. A determinate progress indicator shows upload progress on the Meeting Summary page. If the browser closes mid-upload, the upload resumes from where it left off on next page load (OPFS retains all chunks until `AudioChunkStoredEvent` confirms GCS receipt).
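
A minimal sketch of the batch loop, assuming the server response is the source of truth for what is still missing. `upload_gaps`, `read_chunk`, and `post_batch` are hypothetical names: `post_batch` stands in for the multipart `POST` and returns the `remaining_missing_sequences` list. The no-progress guard is an addition for the sketch, not something the flow specifies.

```python
from typing import Callable, Dict, Iterable, List

BATCH_SIZE = 50  # batch size from the flow


def upload_gaps(
    missing: Iterable[int],
    read_chunk: Callable[[int], bytes],
    post_batch: Callable[[Dict[int, bytes]], List[int]],
) -> int:
    """Upload missing chunks in batches; returns the number of requests sent."""
    remaining = sorted(missing)
    requests = 0
    while remaining:
        batch = remaining[:BATCH_SIZE]
        payload = {seq: read_chunk(seq) for seq in batch}  # read from OPFS
        next_remaining = sorted(post_batch(payload))
        requests += 1
        if len(next_remaining) >= len(remaining):
            # Assumed safeguard: stop rather than loop forever if nothing was accepted.
            raise RuntimeError("gap upload made no progress")
        remaining = next_remaining
    return requests
```

At 50 chunks per request, the 18,000-chunk example above takes 360 requests, and the progress indicator can be driven by the shrinking `remaining` list.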

### Flow 20: Soft-Deleted Meeting During Post-Processing — **New** [#flow-20-soft-deleted-meeting-during-post-processing--new]

Meetings are soft-deleted (flagged with `deleted_at`, not removed from the database). If a user soft-deletes a meeting while post-meeting processing is running, ML's write-back calls to Core REST will encounter a soft-deleted resource. Core handles this gracefully: write-back endpoints (`PUT /transcriptions/{id}/segments`, `PUT /meetings/{id}/synthesis`, `PATCH /transcriptions/{id}/status`) check `deleted_at` and return `204 No Content` if the meeting is soft-deleted. ML treats this as success (no retry). The post-meeting processing completes silently — artefacts are written to the soft-deleted meeting's rows in the database but are never visible to the user. This avoids 404 errors, unnecessary retries, and DLQ noise.
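
The write-back behaviour can be sketched as below. This is an illustration only: the `Meeting` type, the handler name, and the `200` status on the normal path are assumptions, and the sketch assumes the write is still applied to a soft-deleted meeting, as the description of artefacts being written suggests.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class Meeting:
    id: str
    deleted_at: Optional[datetime] = None
    synthesis: Dict[str, str] = field(default_factory=dict)


def put_synthesis(meeting: Meeting, synthesis: Dict[str, str]) -> int:
    """Hypothetical handler body for a write-back endpoint; returns an HTTP status."""
    meeting.synthesis = dict(synthesis)  # rows are written either way
    if meeting.deleted_at is not None:
        return 204  # soft-deleted: acknowledge with No Content
    return 200      # assumed status for the normal path


def ml_should_retry(status: int) -> bool:
    """ML treats any 2xx response, including 204, as success."""
    return not (200 <= status < 300)
```

For example, a meeting with `deleted_at` set yields `204`, and `ml_should_retry(204)` is `False`, so the pipeline finishes without retries or dead-letter traffic.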

***

## Design Decisions [#design-decisions]

Key architectural choices and their rationale. These are the "why" behind the flows above.

| Decision                                                      | Rationale                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| ------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Dual-write (WS + async DB)**                                | Stream to client via WebSocket for minimum latency (\~200ms). Persist to DB asynchronously so a DB hiccup doesn't block the live experience.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **Echo-suppressed optimistic mutations**                      | Client updates local state immediately (optimistic), sends via REST, then suppresses the returning WebSocket `EntityChanged` event using a session identifier. Gives instant UI feedback without double-rendering.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| **WebSocket for Core↔ML**                                     | Audio flows upstream as binary frames and insights flow downstream as CloudEvents text frames on the same long-lived WebSocket. Supports bidirectional control events (DrainCommand, BackpressureEvent), replay cursors for reconnection, and speaker state push — capabilities that would require a separate control channel with HTTP streaming. Core acts as a protocol bridge: browser-facing WebSocket on the client side, service-to-service WebSocket on the ML side.                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **OPFS always-on shadow buffer**                              | Every audio chunk is written to the browser's Origin Private File System via `createSyncAccessHandle()` in a dedicated Web Worker before (or instead of) being sent to Core. The buffer runs unconditionally — it captures audio regardless of Core or GCS connectivity. This separates audio capture (which must never fail) from transport (which can be retried). The buffer is cleared only after Core confirms all chunks are safely in GCS.                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| **Chunk-based GCS writes + hierarchical compose**             | Each audio chunk is stored as a separate GCS object keyed by sequence number (`meetings/{id}/chunks/{seq:08d}.webm`) rather than as a single streaming upload. This enables gap recovery: any chunk missed during a connectivity failure can be backfilled by sequence number. At session end, Core composes the chunk objects into the final `audio.webm` using GCS Compose — hierarchically in groups of ≤32 for recordings that exceed GCS's 32-object compose limit.                                                                                                                                                                                                                                                                                                                                                                                                                             |
| **GCS as the indestructible recording**                       | Audio always reaches GCS eventually, even across connectivity failures, because the OPFS shadow buffer guarantees local capture. Everything else (transcript, insights, tasks) can be rebuilt from the audio during post-meeting processing.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **Task preservation for live recordings**                     | Post-meeting re-processing replaces transcript segments and regenerates synthesis, but must not regenerate tasks. Users create and edit tasks during the live session — clobbering them would destroy user work. The Pub/Sub job carries a flag to skip task extraction when triggered from a live recording.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| **Transcription status state machine**                        | `pending → transcribing → synthesizing → completed` gives the client granular progress without polling. Each transition fires an `EntityChanged` event. Fewer states reduce complexity while still distinguishing the two user-visible phases: transcript generation and insight synthesis.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| **Signed URL with client-side rotation**                      | Client streams audio directly from GCS (no Core proxy). Signed URLs expire after 1 hour. Client sets a timer to refresh before expiry for seamless playback.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **Dual-layer silence detection**                              | Client-side `AnalyserNode` catches mic issues locally, with no server round-trip. Server-side chunk timeout catches broken streams the client can't detect. Neither alone covers both cases. |
| **Concurrent session as pre-condition (not a separate flow)** | The check is a guard on Flow 1, not an independent workflow. Client checks on page load; Core enforces atomically before starting a session.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **Batched LLM query (talking points + tasks)**                | A single structured output call extracts both talking points and tasks from the same prompt. The rolling transcript buffer is loaded once into the prompt cache; adding a second extraction task to the same query costs almost nothing vs. a separate call.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **Rolling transcript buffer with prompt caching**             | ML appends each finalised segment to an in-memory buffer. Each call to the LLM sends `[system instructions] [schema] [anchor] [recent window] [latest segment] [existing task list]`. OpenAI caches from the prompt start, so the cached region (`[system] + [schema] + [anchor]`) grows as the anchor grows. When the context budget is exceeded, segments are dropped from the **oldest non-anchored position** (between anchor and recent window) — never from the front. Dropping from the front would change the content right after the static prefix, invalidating the entire transcript cache. Dropping from the middle preserves the cached prefix and keeps the most recent context intact. The existing task list in the dynamic suffix lets the LLM deduplicate naturally — it returns only tasks not already in the list, eliminating the need for a separate semantic similarity step. |
| **Speaker state externalised to `meeting_speaker_states`**    | The in-memory `speaker_label → state` map is mirrored to Postgres on meaningful transitions (`matched`, `exhausted`, `manual`). Core pushes current speaker states and voice profiles to ML on every session start and WebSocket reconnect (via `StreamStartEvent`), so ML reconstructs its map without a pull endpoint. Attempts are tracked in-memory only; an unmatched speaker restarting at 0 on recovery is acceptable since it retries a bounded number of times before exhausting again. Manual overrides written by Core (Flows 7/8) are immediately visible on any reconnect, so user resolutions are never lost or re-overridden by voice matching.                                                                                                                                                                                                                                       |
| **Progressive speaker matching with lock-on**                 | ML tries to match each diarised speaker to an enrolled voice profile. Once a confident match is found, the speaker label is locked — no further voice comparison is done for that speaker. Unknown speakers fail fast after a bounded number of attempts. Manual state (set by user labelling) takes precedence and cannot be overridden by voice matching. Voice profile enrichment from session embeddings is deferred to post-meeting processing.                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| **Sticky session affinity (not a backplane)**                 | Load balancer routes all WebSocket frames for a session to the same Core pod. No pod-to-pod event routing exists today. This is a known scaling constraint documented separately as a problem statement.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| **Session not resumable after tab close (v1)**                | OPFS data persists beyond tab close, but the recording session does not. If the user closes the tab, the session ends and post-meeting processing runs on whatever audio reached GCS. Session resume is a future enhancement, captured as a separate problem statement.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| **WebSocket heartbeat (30s ping/pong)**                       | Detects zombie connections after two missed pongs (\~60 seconds) rather than waiting for TCP timeout (minutes). The missed pongs trigger the client-side OPFS-bridges-the-gap path (Flow 16). |
| **Chunk integrity via CRC32**                                 | Each chunk carries a CRC32 checksum on the hot path (\~10 chunks/sec). CRC32 is sufficient for detecting transmission corruption at this frequency. Core verifies on receipt; corrupted chunks are re-requested from OPFS during gap recovery. OPFS stores each chunk with a CRC32 integrity envelope so corruption can be detected on read. Gap recovery uploads via REST use the same CRC32.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| **Pre-warm AssemblyAI on mic permission**                     | The upstream streaming session opens when the browser grants mic access, not when the first audio chunk arrives. Saves \~500–800ms on first-segment latency.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| **Sequential post-meeting pipeline**                          | Post-meeting processing is always sequential: batch transcription first (replaces live segments with higher-accuracy results), then synthesis (headline, summary, topics, talking points). Synthesis depends on the final transcript, so the stages cannot be parallelised. Task extraction is skipped for live recordings (tasks were captured during the session). Each stage updates the transcription status, giving the client granular progress.                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| **Talking-point cadence: per finalised segment**              | An LLM call fires on every finalised transcript segment. The rolling transcript buffer is already loaded in the prompt cache, so the marginal cost of each call is low (dynamic suffix only). This gives the fastest possible insight updates — the user sees talking points and tasks within seconds of speech. If cost becomes a concern at scale, the cadence can be relaxed to a batched window without changing the contract.                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| **GCS chunk lifecycle: 24h TTL after compose**                | Once `audio.webm` is composed, chunk objects (`chunks/{seq:08d}.webm`) are no longer needed. A GCS lifecycle rule deletes them 24 hours after composition. The delay provides a safety window for debugging or re-composition.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
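
The hierarchical compose decision can be illustrated with a planner that only computes which objects to combine at each level. This is a sketch under assumptions: `plan_compose` and the `tmp/compose` intermediate prefix are hypothetical, and no GCS calls are made. The 32-source limit is the GCS Compose limit cited in the table.

```python
from typing import List, Tuple

MAX_COMPOSE_SOURCES = 32  # GCS Compose accepts at most 32 source objects

ComposeOp = Tuple[List[str], str]  # (source object names, destination name)


def plan_compose(
    chunk_names: List[str], final_name: str, tmp_prefix: str = "tmp/compose"
) -> List[ComposeOp]:
    """Return compose operations in execution order, preserving chunk order."""
    if not chunk_names:
        raise ValueError("nothing to compose")
    ops: List[ComposeOp] = []
    current = list(chunk_names)
    level = 0
    while len(current) > MAX_COMPOSE_SOURCES:
        next_level: List[str] = []
        for i in range(0, len(current), MAX_COMPOSE_SOURCES):
            group = current[i : i + MAX_COMPOSE_SOURCES]
            # Hypothetical naming for intermediate objects.
            dest = f"{tmp_prefix}/L{level}/{i // MAX_COMPOSE_SOURCES:08d}"
            ops.append((group, dest))
            next_level.append(dest)
        current = next_level
        level += 1
    ops.append((current, final_name))  # final pass has at most 32 sources
    return ops
```

Each level reduces the object count by a factor of up to 32, so even long recordings need only a few levels. Intermediate objects would need cleanup afterwards, which this sketch leaves out.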
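
The eviction rule for the rolling transcript buffer (drop from the oldest non-anchored position, never from the front) can be sketched as follows. The class name, the fixed `anchor_len`, and the use of character counts as a stand-in for tokens are all simplifying assumptions; the table does not specify how the anchor grows or how the budget is measured.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RollingTranscriptBuffer:
    budget: int          # max total size; characters here, as a stand-in for tokens
    anchor_len: int = 2  # hypothetical: leading segments kept as the anchor
    segments: List[str] = field(default_factory=list)

    def _size(self) -> int:
        return sum(len(s) for s in self.segments)

    def append(self, segment: str) -> None:
        """Add a finalised segment, evicting from just after the anchor if needed."""
        self.segments.append(segment)
        # Never evict the anchor, and always keep the latest segment.
        while self._size() > self.budget and len(self.segments) > self.anchor_len + 1:
            del self.segments[self.anchor_len]  # oldest non-anchored segment

    def prompt_parts(self, system: str, schema: str, tasks: List[str]) -> List[str]:
        """Order parts so the static prefix and anchor always come first."""
        anchor = self.segments[: self.anchor_len]
        rest = self.segments[self.anchor_len :]
        return [system, schema, *anchor, *rest, *tasks]
```

Because eviction happens after the anchor, the leading parts of the prompt are identical from call to call, which is the property the prompt cache depends on.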

***

## Boundary Inventory [#boundary-inventory]

Every boundary shown in the diagrams above. Each becomes a contract on the [Contracts](contracts) page.

| Boundary                 | Flows  | From → To        | Protocol                         | Data shape                                                                                                                                                                                                |
| ------------------------ | ------ | ---------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Meeting CRUD             | 1, 4   | App → Core       | REST                             | Create meeting (live recording); patch meeting notes (echo-suppressed)                                                                                                                                    |
| Recording commands       | 1, 9   | App → Core       | WebSocket                        | `StartRecordingCommand`, `StopRecordingCommand`                                                                                                                                                           |
| Audio streaming          | 2      | App → Core → ML  | WS (binary) → ML WS (binary)     | Raw audio chunks (sequence-numbered, Core enriches with ml\_session\_id)                                                                                                                                  |
| Live insights            | 3a, 3b | ML → Core → App  | ML WS (CloudEvents) → Browser WS | Talking points, tasks, embeddings, speaker matches, speaker exhausted                                                                                                                                     |
| Task CRUD                | 5, 6   | App → Core       | REST                             | Task create/update/delete (idempotent create, echo-suppressed; cascading sub-task nesting)                                                                                                                |
| Person creation          | 8      | App → Core       | REST                             | Create person                                                                                                                                                                                             |
| Speaker labels           | 7, 8   | App → Core       | REST                             | Speaker-to-person assignment, meeting-scoped (always references an existing person)                                                                                                                       |
| Notes auto-save          | 4      | App → Core       | REST                             | Meeting notes patch (debounced, echo-suppressed)                                                                                                                                                          |
| OPFS gap upload          | 9, 16  | App → Core       | REST                             | Sequence-numbered audio chunks from OPFS shadow buffer; Core deduplicates by sequence number                                                                                                              |
| Degraded mode            | 14, 16 | Core → App       | WebSocket                        | `RecordingErrorEvent` (error code variants: `ml_unavailable`, `ml_recovered`, `storage_unavailable`, `storage_recovered`, `insight_warning`, `transcoder_error`, `no_audio_detected`, `session_conflict`) |
| Duration warning         | 10     | Core → App       | WebSocket                        | `RecordingDurationWarningEvent`                                                                                                                                                                           |
| Concurrent session check | 1      | App → Core       | REST                             | Active session read (read-only guard, no mutation)                                                                                                                                                        |
| Transcription status     | 12     | ML → Core → App  | REST → WebSocket                 | Transcription status transitions + `EntityChanged (transcription)`                                                                                                                                        |
| Signed URL               | 13     | App → Core → GCS | REST → GCS signed URL            | Signed URL fetch (404 while processing, 200 when ready; 1-hour expiry, client-side rotation)                                                                                                              |
| Post-meeting trigger     | 9, 10  | Core → ML        | Pub/Sub                          | `TranscriptionJob` (published after drain completes and audio is composed)                                                                                                                                |
| Synthesis write-back     | 11     | ML → Core        | REST                             | Transcript segments replace-all; meeting headline; synthesis artefacts (summary, topics, talking points); system-generated tasks                                                                          |


# Overview (/docs/work/meeting-recording/tdd)


# Technical Design Document [#technical-design-document]

> **Status**: Agreed
>
> **Author**: Ryan Nel
>
> **Date**: 2026-05-01

## Success Criteria [#success-criteria]

| Criterion                                                                                                  | Measured by                                                                                  |
| ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| User can start a live recording from the browser and see real-time transcript within 2 seconds of speech   | End-to-end latency from mic input to transcript segment on screen                            |
| Audio is never lost, even across connectivity failures                                                     | Zero-gap rate: all chunks reach GCS via direct upload or OPFS gap recovery                   |
| Post-meeting artefacts (headline, summary, topics, talking points) reach final quality without user action | Transcription status reaches `completed` and all synthesis artefacts are present             |
| Live session degrades gracefully — audio capture continues even if ML or insights fail                     | Recording produces a complete audio file even when ML is unavailable for part of the session |
| A single active recording per user at any time                                                             | Concurrent session guard enforced client-side and server-side                                |

## Architectural Approach [#architectural-approach]

The system connects browser-captured microphone audio to the existing ML pipeline (AssemblyAI transcription, OpenAI insights) via a streaming architecture with three layers of durability.

**Core path:** Browser → Core (WebSocket, binary frames) → ML (WebSocket) → AssemblyAI (real-time streaming). Insights flow back: ML → Core → Browser on the same WebSocket connections.

**Durability strategy:** Audio is captured at three levels simultaneously:

1. **OPFS shadow buffer** — every chunk is written to the browser's Origin Private File System via a dedicated Web Worker before transport. This runs unconditionally.
2. **GCS chunk storage** — each chunk is stored as a separate GCS object keyed by sequence number. Gap recovery backfills any missing chunks from OPFS.
3. **Post-meeting reprocessing** — the composed audio file is batch-transcribed at higher accuracy, replacing live segments entirely.
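
The gap-recovery step in layer 2 can be pictured as a set difference over sequence numbers. The sketch below is illustrative only: it assumes sequence numbers start at 0 and are contiguous, and `missing_sequences` and `accept_chunk` are hypothetical names rather than functions from the codebase.

```python
def missing_sequences(stored: set, last_seq: int) -> list:
    """Return the sequence numbers in 0..last_seq that storage does not hold yet."""
    return [seq for seq in range(last_seq + 1) if seq not in stored]


def accept_chunk(stored: set, seq: int) -> bool:
    """Record a chunk; return False if this sequence number was already stored."""
    if seq in stored:
        return False  # duplicate upload, safe to ignore
    stored.add(seq)
    return True
```

Because chunks are keyed by sequence number, a backfill from OPFS can be retried safely: a chunk that already arrived through the live path is simply rejected as a duplicate.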

**Key architectural decisions:**

* **Dual-write** (WebSocket for latency, async DB for durability) — a DB hiccup doesn't block the live experience
* **Echo-suppressed optimistic mutations** — instant UI feedback without double-rendering
* **Chunk-based GCS writes with hierarchical compose** — enables gap recovery by sequence number; compose at session end
* **Sequential post-meeting pipeline** — batch transcription must complete before synthesis runs (synthesis depends on final transcript)
* **Task preservation** — post-meeting processing skips task extraction for live recordings to avoid clobbering user-created tasks

For the full rationale on all decisions, see the [Design Decisions table in Data Flow](data-flow#design-decisions).
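
As a rough illustration of hierarchical compose, the sketch below plans the compose rounds for a list of chunk objects. Cloud Storage accepts at most 32 source objects per compose request, so a long recording needs several rounds. The function name and the intermediate object naming scheme are assumptions made for this example, and the sketch only plans the work; it makes no storage calls.

```python
MAX_COMPOSE_SOURCES = 32  # Cloud Storage limit on source objects per compose request


def plan_compose(chunks: list, fan_in: int = MAX_COMPOSE_SOURCES) -> list:
    """Return compose rounds; each round is a list of source groups.

    Every group is composed into one intermediate object, and the
    intermediates feed the next round until a single object remains.
    """
    rounds = []
    level = list(chunks)
    depth = 0
    while len(level) > 1:
        groups = [level[i:i + fan_in] for i in range(0, len(level), fan_in)]
        rounds.append(groups)
        # Hypothetical naming scheme for intermediate objects.
        level = [f"tmp/r{depth}-{n}" for n in range(len(groups))]
        depth += 1
    return rounds
```

Under these assumptions, a 10,000-chunk recording needs three rounds of 313, 10, and 1 compose calls.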

## Constraints [#constraints]

Architectural constraints discovered during design, in addition to the [no-gos in the Pitch](../pitch#no-gos):

* **Sticky session affinity** — all WebSocket frames for a session route to the same Core pod. No pod-to-pod event routing (backplane) exists. This is a known scaling constraint, captured as a [problem statement](../../../problem-statements/backplane).
* **Session not resumable after tab close (v1)** — OPFS data persists, but the recording session does not. Tab close ends the session. Session recovery is a [separate problem statement](../../../problem-statements/session-recovery).
* **Desktop browsers only (this bet)** — Chrome/Edge primary, Safari 17+ best-effort. Mobile architecture should not be precluded.
* **Single AssemblyAI model** — no custom vocabulary or domain-specific tuning.
* **5-minute WebSocket replay buffer** — reconnects beyond 5 minutes require full REST re-fetch. Captured as a [problem statement](../../../problem-statements/replay-buffer-optimization).

## Open Questions [#open-questions]

| Question                                                                | Owner | Status                               |
| ----------------------------------------------------------------------- | ----- | ------------------------------------ |
| Safari 17+ `createSyncAccessHandle()` support — confirmed in workers?   | App   | To verify during milestone 1         |
| GCS compose latency for long recordings (10k+ chunks) — need benchmarks | Core  | To measure during milestone 2        |
| AssemblyAI v3 turn-based API migration timeline                         | ML    | Monitoring — no action needed for v1 |

***

## Navigation Map [#navigation-map]

| Document                 | What it covers                                                                        |
| ------------------------ | ------------------------------------------------------------------------------------- |
| [UI Design](ui-design)   | Wireframes, screen states, and interaction patterns                                   |
| [Data Flow](data-flow)   | 20 sequence diagrams across live session, user mutations, and post-meeting processing |
| [Contracts](contracts)   | API shapes for every boundary: REST, WebSocket, Pub/Sub, binary audio frames          |
| [Schemas](schemas)       | Database table designs for Core and ML                                                |
| [Milestones](milestones) | Build plan broken into shippable slices                                               |

***

## Architecture Scaffolding [#architecture-scaffolding]

To maintain structural consistency and immediate integration with the test suite, always use the CLI to generate the remaining TDD components.

**Add a Milestone**

```bash
./dev new milestone meeting-recording <milestone-slug>
```

**Add a Domain Slice** (Automatically connects to pytest suite)

```bash
./dev new slice meeting-recording <milestone-slug> <domain> <slice-slug>
```

**Design a Contract**

```bash
./dev new contract meeting-recording <service> <protocol>
```

**Design a Database Schema**

```bash
./dev new schema meeting-recording <service> <database-tech>
```


# UI Design (/docs/work/meeting-recording/tdd/ui-design)


# UI Design: Meeting Recording [#ui-design-meeting-recording]

***

## 1. Screens [#1-screens]

### The Entry Point [#the-entry-point]

The existing "New Meeting" button becomes a dropdown with two options: **Upload File** (existing) and **Start Live Recording** (new). No new navigation patterns — we're extending what's already there.

<img alt="Start Recording Dropdown" src="__img0" />

***

### Live Recording View [#live-recording-view]

A focused, distraction-free workspace. The user's private notes take centre stage — the system's output sits alongside as an ambient feed, never competing for attention.

<img alt="Active Recording UI Layout" src="__img1" />

**Layout:**

* **Recording Banner (top):** Pulsing recording indicator, elapsed timer, Stop Recording button. If the ML connection drops, a degraded-mode warning appears inline here.
* **Left Column (primary) — Private Notes:** A single rich-text scratchpad per meeting. Auto-saves — no save button. The user writes freely during the session and can continue editing after.
* **Right Column (secondary) — Context Panel:** Live transcript stream (auto-scrolling), real-time talking points, and extracted tasks.

**States:**

| State                      | What the user sees                                                                                                                                                                                                                                                                    |
| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Mic Permission**         | Browser permission dialog. Behind it: the recording view skeleton — two-column layout with grey placeholder blocks.                                                                                                                                                                   |
| **Connecting**             | Skeleton visible. Banner shows "Connecting…" with a spinner instead of the pulsing dot.                                                                                                                                                                                               |
| **Awaiting First Segment** | Banner transitions to `● Recording 00:00`. Transcript area shows "Listening…" in muted text. Notes scratchpad is active and ready.                                                                                                                                                    |
| **Active**                 | Transcript streaming in the context panel. Talking points and tasks arriving. User writing notes.                                                                                                                                                                                     |
| **Degraded**               | ML connection drops — transcript and sidebar freeze on the last-received content. An inline warning appears above: "Live insights paused — audio is still recording." Notes scratchpad and audio capture continue uninterrupted. Clears automatically on reconnect.                   |
| **Connectivity Lost**      | The browser loses its connection to the server. Banner: "Connection lost — audio is being saved on this device. Nothing will be lost." Context panel freezes. Notes scratchpad remains active. A gap placeholder appears in the transcript at the point of the last received segment. |
| **Reconnected**            | Connection restored. Banner: "Reconnected. Live transcript continues from here. The full transcript (including the gap) will regenerate when the meeting ends." Gap placeholder remains visible in the transcript until post-meeting processing completes. Live insights resume.      |
| **Duration Warning**       | At T-10 minutes before the 4-hour limit, a non-blocking banner: "Recording will automatically stop in 10 minutes."                                                                                                                                                                    |

**Key Interactions:**

* **Auto-scroll:** The transcript auto-scrolls to the newest segment. Auto-scrolling pauses when the user scrolls up manually, and a "Jump to latest →" button appears.
* **Speaker labelling:** Each transcript segment shows a speaker label. User can click it to reassign to a known person or add a new one. Reassignments update all segments from that speaker.
* **Interim vs. final segments:** Interim segments display muted/italic and are replaced in-place (no layout shift) when their final version arrives.
* **Tasks:** Each task has a description (required), assignee (optional), and due date (optional). Tasks can be nested. Tasks are labelled as either `system` or `user` generated. If the user edits a system-generated task, it becomes a user task. Tasks have a completion state (checkbox). Uses the standard task component.
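
The in-place replacement of interim segments could look roughly like the sketch below. It assumes that an interim segment and its final version share the same segment id, which this page does not confirm; the `Segment` type and `apply_segment` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    id: str
    text: str
    is_final: bool


def apply_segment(transcript: list, incoming: Segment) -> None:
    """Insert a new segment, or overwrite the existing one at the same position."""
    for i, existing in enumerate(transcript):
        if existing.id == incoming.id:
            if existing.is_final and not incoming.is_final:
                return  # a late interim never downgrades a final segment
            transcript[i] = incoming  # same index, so the list order is unchanged
            return
    transcript.append(incoming)
```

Overwriting at the same index keeps the segment in its existing row, which is what allows the UI to avoid a layout shift.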

***

### Meeting Summary — Overview [#meeting-summary--overview]

After the recording stops, the UI transitions to the Meeting Summary page, which has two tabs at the top: **Overview** (active) and **Transcript**.

<img alt="Meeting Summary Overview Tab" src="__img2" />

**Layout (single column, stacked):**

* **Summary:** Auto-generated headline, summary, and topics.
* **Tasks:** All tasks — both user-created and system-extracted.
* **Notes:** The user's private notes from the live session, still editable.

**States:**

| State             | What the user sees                                                                                                                                                                  |
| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Re-processing** | Summary, tasks, and notes visible immediately. A subtle progress indicator shows the background job is upgrading accuracy. Content updates in real time as each artefact completes. |
| **Complete**      | All artefacts finalised. No progress indicator.                                                                                                                                     |

**Key Interactions:**

* **Edit notes:** Notes remain fully editable after the session — same scratchpad, no mode change.
* **Task CRUD:** User can add, edit, and delete tasks. Editing a system-generated task promotes it to a user task. Tasks can be marked complete, nested under other tasks, and assigned to people with optional due dates.
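
The promotion rule can be sketched as a pure function. The `Task` shape below is a simplified assumption (the real task also carries an assignee, due date, and nesting), and `edit_description` is a hypothetical name.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task:
    description: str
    source: str  # "system" or "user"
    completed: bool = False


def edit_description(task: Task, description: str) -> Task:
    """Return the edited task; a user edit always makes it user-owned."""
    return replace(task, description=description, source="user")
```

Whether toggling the completion checkbox also counts as an edit is not stated here, so the sketch covers description edits only.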

***

### Meeting Summary — Transcript [#meeting-summary--transcript]

The **Transcript** tab shows the full transcript with an audio player pinned to the top.

<img alt="Meeting Summary Transcript Tab" src="__img3" />

**Layout:**

* **Audio Player (fixed header):** Play/pause, ±15s skip, scrub bar, current/total time, playback speed (0.5×, 1×, 1.5×, 2×). Audio streams — playback begins before the full file downloads.
* **Transcript:** Segments highlighted in sync with audio playback. Clicking a segment seeks to that moment.

**States:**

| State                 | What the user sees                                                                                |
| --------------------- | ------------------------------------------------------------------------------------------------- |
| **Audio Processing**  | Player disabled (greyed out): "Audio is still being processed." Enables automatically when ready. |
| **Ready**             | Full player controls active. Transcript clickable and synced.                                     |
| **Audio Unavailable** | Player error state: "Audio unavailable." Auto-retries with a fresh URL.                           |

**Key Interactions:**

* **Synced highlighting:** The transcript segment matching the current playback time is visually highlighted, and the view auto-scrolls to keep it visible.
* **Click-to-seek:** Clicking any transcript segment seeks the audio to that segment's start time.
* **Speaker labelling:** Each segment shows a speaker label. User can click to reassign or add a new person. Reassignments update all segments from that speaker.
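
One way to find the segment to highlight is a binary search over segment start times. This is a sketch under assumptions: segments are sorted by start time and do not overlap, bounds are stored in milliseconds, and the player reports time in seconds. The function name is hypothetical.

```python
from bisect import bisect_right
from typing import Optional


def active_segment_index(starts_ms: list, ends_ms: list, current_time_s: float) -> Optional[int]:
    """Return the index of the segment playing at current_time_s, or None in a gap."""
    t_ms = current_time_s * 1000  # player time is in seconds; segment bounds are in ms
    i = bisect_right(starts_ms, t_ms) - 1  # last segment starting at or before t_ms
    if i >= 0 and t_ms < ends_ms[i]:
        return i
    return None
```

The lookup is logarithmic in the number of segments, which matters for long sessions where playback time updates fire several times per second.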

***

## 2. User Journeys [#2-user-journeys]

### Live Recording [#live-recording]

```text
[New Meeting ▾] → Start Live Recording
       │
       ▼
  Mic Permission Prompt ──denied──→ Blocking modal (links to browser settings)
       │ granted
       ▼
  Connecting → Live Recording View
       │
       │  ← user speaks, system streams transcript, talking points, tasks
       │  ← user writes notes, adds tasks, and labels transcript segments
       │
       ▼
  [Stop Recording]
       │
       ▼
  "Ending session…" → Meeting Summary (Overview tab)
       │
       └──→ Background re-processing upgrades all artefacts (except tasks)
```

### Post-Meeting Review [#post-meeting-review]

```text
  Meeting Summary — Overview tab
       │
       │  ← review summary, edit notes, add/edit/delete tasks
       │
       ├──→ Transcript tab
       │      │
       │      │  ← play audio, follow along with synced transcript
       │      │  ← click segment to seek, label speakers
       │      │
       │      └──→ back to Overview tab
```

***

## 3. Edge Cases [#3-edge-cases]

| Scenario                            | Behaviour                                                                                                                                                                                                                                             |
| ----------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Concurrent session**              | If a recording is already active, "Start Live Recording" is disabled with a tooltip.                                                                                                                                                                  |
| **ML connection drops**             | Transcript and sidebar freeze on last-received content. Inline warning appears. Notes and audio capture continue. Recovery is automatic.                                                                                                              |
| **Browser loses server connection** | Audio saved to OPFS on the device — nothing is lost. Banner: "Connection lost — audio is being saved on this device." Gap placeholder appears in the transcript at the disconnect point. On reconnect, audio uploads automatically in the background. |
| **GCS temporarily unavailable**     | Audio continues flowing; a gap accumulates in cloud storage. Client shows connectivity degraded banner. OPFS buffer captures the gap. Clears automatically when GCS recovers; gap backfilled from OPFS.                                               |
| **Browser tab backgrounded**        | Recording continues. Page title changes to "● Recording…" so the user can find the tab.                                                                                                                                                               |
| **Long session (60+ min)**          | Transcript must virtualize — thousands of segments without virtual scrolling will freeze the browser.                                                                                                                                                 |
| **Auto-stop at 4 hours**            | System stops recording and runs the standard post-meeting pipeline automatically.                                                                                                                                                                     |
| **No audio detected (10s)**         | Inline notice: "We're not detecting audio. Check your microphone." Clears on first segment.                                                                                                                                                           |
| **Same user, multiple tabs**        | Only one tab may control a recording. If a recording is active in another tab, the second tab shows "Recording active in another tab" with a link to switch. No takeover — the original tab owns the session.                                         |
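
The duration warning and auto-stop rows can be expressed as a small decision function. This is a minimal sketch: the 4-hour limit and 10-minute lead come from this page, while the function name and the `warned` flag are assumptions for illustration.

```python
MAX_RECORDING_S = 4 * 60 * 60  # 4-hour limit
WARNING_LEAD_S = 10 * 60       # warn 10 minutes before the limit


def duration_action(elapsed_s: float, warned: bool) -> str:
    """Decide what to do at a given elapsed time; `warned` prevents repeat warnings."""
    if elapsed_s >= MAX_RECORDING_S:
        return "stop"
    if elapsed_s >= MAX_RECORDING_S - WARNING_LEAD_S and not warned:
        return "warn"
    return "continue"
```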


# Data Flow (/docs/work/delivered/live-capture/03-tdd/data-flow)


{/* LLM-Context: TL;DR:
  Data flow for the Meeting Recording bet.
  8 numbered flows (Flow 3 splits into 3a, 3b, and 3c): Start Recording, Live Audio→Transcription, Live Talking Points (per segment),
  Live Task Extraction (~60s), Live Speaker ID, User Creates Task, User Labels Speaker, Stop Recording,
  Post-Meeting Processing, Audio Playback (signed URL direct to GCS).
  Each arrow = contract boundary + sequencing constraint. All contracts are in the Contracts page.
  */}

# Data Flow [#data-flow]

For each step in the [User Flow](user-flow), this page draws what calls what: which service initiates, which responds, what data crosses each boundary. Read each arrow two ways: it is a **contract boundary** (what shape the data takes) and a **sequencing constraint** (downstream cannot build until the upstream contract is published).

## System Context [#system-context]

<Mermaid
  chart="`graph TB
  USER[&#x22;👤 User&#x22;] --> APP[&#x22;wordloop-app<br/>(Next.js)&#x22;]
  APP -- &#x22;REST (mutations)&#x22; --> CORE[&#x22;wordloop-core<br/>(Go)&#x22;]
  APP -. &#x22;WebSocket<br/>(events + audio)&#x22; .-> CORE
  CORE --> DB[(Postgres)]
  CORE --> GCS[(&#x22;☁️ Cloud Storage<br/>(audio files)&#x22;)]
  CORE -. &#x22;HTTP stream<br/>(audio + segments)&#x22; .-> ML[&#x22;wordloop-ml<br/>(Python)&#x22;]
  ML -. &#x22;HTTP stream<br/>(segments + insights)&#x22; .-> CORE
  ML --> AAI[&#x22;🎙️ AssemblyAI<br/>(transcription)&#x22;]
  ML --> LLM[&#x22;🤖 OpenAI<br/>(insights)&#x22;]
  CORE -. &#x22;Pub/Sub<br/>(async jobs)&#x22; .-> ML`"
/>

***

## Flow 1: Start Recording [#flow-1-start-recording]

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant GCS
  participant ML
  participant AAI as AssemblyAI

  App->>Core: POST /meetings (source_type: recording)
  Core->>DB: INSERT meeting
  Core-->>App: 201 Meeting { id }
  Note over App: UI shows new meeting

  App->>Core: WS: StartRecordingCommand { meeting_id, audio_config }
  Core->>GCS: Open write stream (meetings/{id}/audio.webm)
  Core->>ML: POST /streaming/start { meeting_id, audio_config }
  ML->>AAI: Open real-time transcription session
  ML-->>Core: 200 { session_id }
  Core-->>App: WS: RecordingStartedEvent { meeting_id, session_id }
  Note over App: UI shows recording indicator`"
/>

## Flow 2: Live Audio → Transcription (Lowest Latency Path) [#flow-2-live-audio--transcription-lowest-latency-path]

Audio flows from the browser microphone through Core and ML to AssemblyAI. Transcript segments return via the **streaming HTTP response** — the same connection ML uses to receive audio. This is a bidirectional HTTP stream: audio chunks flow upstream, segments and insights flow downstream.

Core streams segments directly to the client via WebSocket for minimum latency, and persists them to the database asynchronously in the background.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant GCS
  participant ML
  participant AAI as AssemblyAI

  loop Every ~100ms audio chunk
      App->>Core: WS: Binary audio frame
      par Store for recovery
          Core->>GCS: Append audio chunk (streaming write)
      and Forward to ML
          Core->>ML: Audio chunk (via HTTP stream)
          ML->>AAI: Forward audio chunk
      end
  end

  Note over AAI: Transcription produced

  AAI-->>ML: Transcript segment (interim or final)
  ML-->>Core: Stream back: TranscriptSegment (via HTTP stream response)

  par Lowest latency: stream to client
      Core-->>App: WS: TranscriptSegmentEvent { segment }
      Note over App: Render segment immediately
  and Persist asynchronously
      Core->>Core: Enqueue async DB write
      Core->>DB: INSERT transcript_segment (background)
  end`"
/>
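
The dual-write pattern in this flow can be sketched with a queue that decouples the client push from the database write. Core is written in Go; the Python below only illustrates the shape of the pattern, and `forward_segment`, `db_writer`, and the callables passed to them are hypothetical.

```python
import asyncio


async def forward_segment(segment: dict, send_to_client, db_queue: asyncio.Queue) -> None:
    """Push the segment to the client first, then queue it for persistence."""
    await send_to_client(segment)  # latency-critical path
    db_queue.put_nowait(segment)   # durability path, not awaited here


async def db_writer(db_queue: asyncio.Queue, insert_segment) -> None:
    """Drain the queue in the background; a slow database only delays this task."""
    while True:
        segment = await db_queue.get()
        try:
            await insert_segment(segment)
        finally:
            db_queue.task_done()
```

The client push never waits on the insert, so a slow database write delays persistence without delaying what the user sees.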

## Flow 3a: Live Talking Points (Fast — Per Finalised Segment) [#flow-3a-live-talking-points-fast--per-finalised-segment]

Talking points update on every finalised transcript segment. ML streams them back through the same HTTP stream as transcript segments. Core forwards them to the client via WebSocket and persists to the database asynchronously — the same dual-write pattern as transcript segments.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant ML
  participant LLM as OpenAI

  Note over ML: Final segment received

  ML->>LLM: Extract/update talking point from latest segment
  LLM-->>ML: TalkingPoint { content, is_final: false }

  ML-->>Core: Stream back: TalkingPoint (via HTTP stream response)

  par Lowest latency: stream to client
      Core-->>App: WS: EntityChanged { entity: talking_point, action: created/updated }
      Note over App: SWR revalidates, shows draft talking point
  and Persist asynchronously
      Core->>Core: Enqueue async DB write
      Core->>DB: UPSERT talking_point (draft)
  end`"
/>

## Flow 3b: Live Task Extraction (Slow — Every \~60s) [#flow-3b-live-task-extraction-slow--every-60s]

Task extraction runs on a slower cadence. ML buffers segments and periodically checks for action items. Tasks also stream back through the HTTP stream, following the same dual-write pattern.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant ML
  participant LLM as OpenAI

  Note over ML: ~60s of segments accumulated

  ML->>LLM: Extract tasks from recent segments
  LLM-->>ML: Task[] (if any detected)

  loop For each new task
      ML-->>Core: Stream back: Task { content, source: system } (via HTTP stream response)

      par Stream to client
          Core-->>App: WS: EntityChanged { entity: task, action: created }
          Note over App: SWR revalidates, shows new task
      and Persist asynchronously
          Core->>Core: Enqueue async DB write
          Core->>DB: INSERT task (background)
      end
  end`"
/>
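
The buffering step could be implemented along these lines. This is a sketch, not the ML service's code: `ExtractionBuffer` is a hypothetical class, and it assumes the window starts at the first buffered segment and is checked whenever a new segment arrives.

```python
from typing import Optional


class ExtractionBuffer:
    """Accumulate final segments and release them as one batch per interval."""

    def __init__(self, interval_s: float = 60.0):
        self.interval_s = interval_s
        self.pending: list = []
        self.window_start: Optional[float] = None

    def add(self, text: str, now_s: float) -> Optional[list]:
        """Buffer a segment; return the batch once the interval has elapsed."""
        if self.window_start is None:
            self.window_start = now_s
        self.pending.append(text)
        if now_s - self.window_start >= self.interval_s:
            batch, self.pending = self.pending, []
            self.window_start = None
            return batch  # caller sends this batch to the LLM
        return None
```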

## Flow 3c: Live Speaker Identification (Per Segment) [#flow-3c-live-speaker-identification-per-segment]

Speaker identification is built into the live transcription flow. For every segment, ML extracts a voice embedding. In parallel, it streams the embedding to Core for storage on the segment and attempts to match it against enrolled voice profiles.

When a user later assigns an AssemblyAI speaker label (e.g. "Speaker A") to a known Person, the system uses all segments with that speaker label to enrich that person's voice profile for improved future matching.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant ML
  participant Core
  participant DB

  Note over ML: Segment received with audio

  ML->>ML: Extract voice embedding (feature_vector)

  par Store features on segment
      ML-->>Core: Stream back: SegmentFeatures { segment_id, feature_vector }
      Core->>Core: Enqueue async DB write
      Core->>DB: UPDATE transcript_segment SET feature_vector
  and Attempt speaker match
      ML->>Core: GET /people?has_voice_profile=true (cached)
      Core-->>ML: Person[] with voice_vectors
      ML->>ML: Compare embedding against known profiles
      alt Match found (score > threshold)
          ML-->>Core: Stream back: SpeakerMatch { segment_id, person_id }
          Core->>DB: UPDATE transcript_segment SET person_id
          Core-->>App: WS: EntityChanged { entity: transcript_segment, action: updated }
          Note over App: Client sees person name instead of speaker label
      else No match
          Note over ML: Keep raw speaker_label
      end
  end`"
/>
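
The comparison step might look like the sketch below. The diagram only states that a match requires a score above a threshold; the use of cosine similarity, the threshold value of 0.75, and the function names are all assumptions made for illustration.

```python
import math
from typing import Optional


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_speaker(embedding: list, profiles: dict, threshold: float = 0.75) -> Optional[str]:
    """Return the best-scoring person id above the threshold, or None."""
    best_id, best_score = None, threshold
    for person_id, voice_vector in profiles.items():
        score = cosine(embedding, voice_vector)
        if score > best_score:  # strictly greater, as in the diagram
            best_id, best_score = person_id, score
    return best_id
```

When no profile scores above the threshold, the function returns `None` and the segment keeps its raw speaker label.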

## Flow 4: User Creates Task During Recording [#flow-4-user-creates-task-during-recording]

This flow uses the standard optimistic mutation pattern with echo-suppressed streaming. The user's task is written via REST (not the streaming path) since it's a user-initiated mutation.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  App->>App: Optimistic: add task to local state
  App->>Core: POST /meetings/{id}/tasks { content, source: user }<br/>Headers: Client-Session-Id, Idempotency-Key
  Core->>DB: INSERT task
  Core-->>App: 201 Task { id }
  Note over App: Replace optimistic with server entity
  Core-->>App: WS: EntityChanged { entity: task, action: created, sourceClientId }
  Note over App: Echo suppressed (sourceClientId matches)`"
/>
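
The two client-side steps in this flow, replacing the optimistic entry and suppressing the echo, can be sketched as follows. The client is a Next.js app; Python is used here only to show the logic. The sketch assumes the `sourceClientId` on the event equals the value the client sent in `Client-Session-Id`, and both function names are hypothetical.

```python
def reconcile(tasks: list, temp_id: str, server_task: dict) -> list:
    """Swap the optimistic placeholder for the entity returned by the server."""
    return [server_task if t["id"] == temp_id else t for t in tasks]


def should_apply_event(event: dict, own_client_id: str) -> bool:
    """Ignore EntityChanged events that echo this client's own mutation."""
    return event.get("sourceClientId") != own_client_id
```

Events from other clients, and system events without a `sourceClientId`, still trigger a refresh.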

## Flow 5: User Labels Speaker as Person [#flow-5-user-labels-speaker-as-person]

When a user identifies "Speaker A" as a known Person, the system enriches that person's voice profile using all segments attributed to that speaker label.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB

  App->>Core: POST /meetings/{id}/speaker-labels { speaker_label: &#x22;Speaker A&#x22;, person_id: &#x22;uuid&#x22; }
  Core->>DB: UPDATE transcript_segments SET person_id WHERE speaker_label = &#x22;Speaker A&#x22;
  Core->>DB: SELECT feature_vectors FROM transcript_segments WHERE speaker_label = &#x22;Speaker A&#x22;
  Core->>Core: Aggregate feature vectors → update person voice profile
  Core->>DB: UPDATE people SET voice_vector = aggregated_embedding
  Core-->>App: 200 OK
  Core-->>App: WS: EntityChanged { entity: transcript_segment, action: updated }
  Core-->>App: WS: EntityChanged { entity: person, action: updated }
  Note over App: All segments now show person name<br/>Future meetings benefit from improved voice profile`"
/>
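
The aggregation step is not specified on this page. One plausible approach, shown here purely as an assumption, is to average the segment vectors element-wise and normalise the result to unit length; `aggregate_voice_vector` is a hypothetical name.

```python
import math


def aggregate_voice_vector(feature_vectors: list) -> list:
    """Average the vectors element-wise, then scale the result to unit length."""
    if not feature_vectors:
        raise ValueError("need at least one feature vector")
    dim = len(feature_vectors[0])
    count = len(feature_vectors)
    mean = [sum(v[i] for v in feature_vectors) / count for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm else mean
```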

## Flow 6: Stop Recording [#flow-6-stop-recording]

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant DB
  participant GCS
  participant PubSub
  participant ML
  participant AAI as AssemblyAI

  App->>Core: WS: StopRecordingCommand { meeting_id }
  Core->>GCS: Finalise audio write stream
  Core->>Core: Close HTTP stream to ML
  Core-->>App: WS: RecordingStoppedEvent { meeting_id }
  Note over App: UI transitions to meeting detail view

  Core->>PubSub: Publish MeetingSessionTerminated { meeting_id, session_id }
  PubSub-->>ML: Consume MeetingSessionTerminated
  ML->>AAI: Close transcription session (drain final segments)
  AAI-->>ML: Final transcript segments
  ML->>Core: POST final segments via REST (session stream is closed)
  Core->>DB: INSERT final segments
  Core-->>App: WS: EntityChanged { entity: transcript_segment }

  Note over Core: Trigger post-meeting processing
  Core->>PubSub: Publish TranscriptionJob { meeting_id, storage_path, skip_tasks: true }
  Note over PubSub: Same job as file upload, but tasks are skipped`"
/>

## Flow 7: Post-Meeting Processing (Automatic, via Pub/Sub) [#flow-7-post-meeting-processing-automatic-via-pubsub]

Post-meeting processing runs automatically via the shared `TranscriptionJob` Pub/Sub worker. For live recordings, the job is published with `skip_tasks: true` to preserve tasks captured during the session.

The worker:

1. Batch-transcribes the full audio from GCS (higher accuracy)
2. Replaces transcript segments with the improved results
3. Generates headline, summary, topics, and finalises talking points (`is_final: true`)
4. Extracts tasks when `skip_tasks: false` (file upload flow only)

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant PubSub
  participant ML
  participant GCS
  participant AAI as AssemblyAI
  participant LLM as OpenAI
  participant Core
  participant DB
  participant App

  PubSub-->>ML: Consume TranscriptionJob { meeting_id, storage_path, skip_tasks }

  ML->>GCS: Download full audio file
  ML->>AAI: Batch transcribe (offline, higher accuracy)
  AAI-->>ML: Full transcript (final segments with speaker labels)

  ML->>Core: PUT /meetings/{id}/transcriptions/{tid}/segments (replace all)
  Core->>DB: Replace transcript segments
  Core-->>App: WS: EntityChanged { entity: transcript_segment }

  ML->>LLM: Generate headline
  ML->>Core: PATCH /meetings/{id} { headline }
  Core->>DB: UPDATE meeting SET headline
  Core-->>App: WS: EntityChanged { entity: meeting, action: updated }

  ML->>LLM: Generate summary + topics + finalise talking points
  ML->>Core: PUT /meetings/{id}/synthesis { summary, topics, talking_points (is_final: true) }
  Core->>DB: UPSERT synthesis
  Core-->>App: WS: EntityChanged { entity: meeting, action: updated }

  alt skip_tasks = false (file upload flow)
      ML->>LLM: Extract tasks
      ML->>Core: POST /meetings/{id}/tasks { content, source: system }
      Core->>DB: INSERT tasks
      Core-->>App: WS: EntityChanged { entity: task, action: created }
  else skip_tasks = true (live recording flow)
      Note over ML: Tasks preserved from live session
  end

  Note over App: All insights now final quality`"
/>

## Flow 8: Audio Playback (Signed URL Direct to GCS) [#flow-8-audio-playback-signed-url-direct-to-gcs]

Core generates a short-lived signed URL. The client streams audio directly from Cloud Storage using that URL, with standard HTTP range requests for seeking.

<Mermaid
  chart="`sequenceDiagram
  autonumber
  participant App
  participant Core
  participant GCS

  App->>Core: GET /meetings/{id}/audio-url
  Core->>GCS: Generate signed URL (expiry: 1 hour)
  GCS-->>Core: Signed URL
  Core-->>App: 200 { url: &#x22;https://storage.googleapis.com/...?X-Goog-Signature=...&#x22;, expires_at: &#x22;...&#x22; }

  App->>GCS: Stream audio (Range header, signed URL)
  GCS-->>App: 206 Partial Content (audio/webm)
  Note over App: HTML5 Audio element plays with seeking

  App->>App: On timeupdate: highlight segment where start_ms ≤ currentTime × 1000 < end_ms
  App->>App: On segment click: audio.currentTime = segment.start_ms / 1000`"
/>
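
The highlight lookup in the last two steps can be sketched as a binary search. The HTML5 media `currentTime` property is reported in seconds while segments carry millisecond offsets, so the comparison needs a unit conversion. This is a minimal sketch of the logic only (the real client would run in the browser); it assumes segments are sorted by `start_ms` and do not overlap, and the function names are hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def active_segment_index(
    start_ms: Sequence[int], end_ms: Sequence[int], current_time_s: float
) -> Optional[int]:
    """Return the index of the segment with start_ms <= t < end_ms, else None."""
    t_ms = current_time_s * 1000.0  # player time is in seconds
    i = bisect_right(start_ms, t_ms) - 1  # last segment starting at or before t
    if i >= 0 and t_ms < end_ms[i]:
        return i
    return None  # before the first segment, or in a gap between segments


def seek_target_s(segment_start_ms: int) -> float:
    """Convert a segment start offset into a player position in seconds."""
    return segment_start_ms / 1000
```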

***

## Boundary Inventory [#boundary-inventory]

This table lists every boundary shown in the diagrams above. Each boundary becomes a contract on the [Contracts](contracts) page.

| Boundary             | From → To        | Protocol                         | Data shape                                           |
| -------------------- | ---------------- | -------------------------------- | ---------------------------------------------------- |
| Meeting CRUD         | App → Core       | REST                             | `POST/PATCH /meetings`                               |
| Recording commands   | App → Core       | WebSocket                        | `StartRecordingCommand`, `StopRecordingCommand`      |
| Audio streaming      | App → Core → ML  | WebSocket (binary) → HTTP stream | Raw audio chunks                                     |
| Live insights        | ML → Core → App  | HTTP stream → WebSocket          | NDJSON events (5 types)                              |
| Speaker labels       | App → Core       | REST                             | `POST /meetings/{id}/speaker-labels`                 |
| Signed URL           | App → Core → GCS | REST → GCS signed URL            | `GET /meetings/{id}/audio-url`                       |
| Post-meeting trigger | Core → ML        | Pub/Sub                          | `TranscriptionJob`, `MeetingSessionTerminated`       |
| Synthesis write-back | ML → Core        | REST                             | `PUT /synthesis`, `PATCH /meetings`, `PUT /segments` |


# User Flow (/docs/work/delivered/live-capture/03-tdd/user-flow)


{/* LLM-Context: TL;DR:
  User flow for the Meeting Recording bet. Formerly the "Design" phase.
  Contains 12 user stories with acceptance criteria from the original specification.
  Screen inventory: Recording Controls (extend Meeting Detail), Live Recording View (new),
  Meeting Playback (extend Meeting Detail).
  Key IA insight: live and playback views share the transcript — layout adapts by state.
  */}

# User Flow [#user-flow]

This page maps the user journey for the Meeting Recording bet. The user stories and acceptance criteria here directly determine what data each screen needs — which in turn determines the API endpoints and their shapes.

***

## User Stories [#user-stories]

### Story 1: Start a Live Recording [#story-1-start-a-live-recording]

> As a **WordLoop user**, I want to **press a single button in the app to start recording a meeting** so that **I don't need any external tools to capture what's being said**.

**Acceptance Criteria:**

* Given I am on a meeting view, when I press "Record", the system begins a live recording session for that meeting
* The app begins capturing audio from the device microphone and streaming it to the server
* The server confirms the session has started
* A recording indicator is visible in the UI for the duration of the session
* If I already have an active recording session, the system prevents starting a second one and shows an error

***

### Story 2: Watch Live Transcription [#story-2-watch-live-transcription]

> As a **WordLoop user**, I want to **see what's being said in real time while recording** so that **I can follow along and verify the system is capturing correctly**.

**Acceptance Criteria:**

* Given a recording is active, when the system produces a transcript segment, it appears in the UI
* Interim segments appear immediately and are visually distinct from final segments
* Final segments replace their corresponding interim segments
* Transcript segments scroll automatically to keep the latest content visible
* Latency from audio capture to text on screen is \< 2 seconds under normal conditions

***

### Story 3: Watch Live Talking Points [#story-3-watch-live-talking-points]

> As a **WordLoop user**, I want to **see the current talking point as the meeting progresses** so that **I have a structured summary building up in real time**.

**Acceptance Criteria:**

* Given a recording is active, when the system detects a new or updated talking point, it appears in the UI
* Talking points created during a live session are marked as **draft**
* Talking points appear as a scrollable list alongside the transcript

***

### Story 4: Watch Live Tasks [#story-4-watch-live-tasks]

> As a **WordLoop user**, I want to **see tasks extracted from the conversation in real time** so that **action items aren't lost**.

**Acceptance Criteria:**

* Given a recording is active, when the system detects an action item, a task is created and appears in the UI
* System-extracted tasks are visually distinguishable from user-created tasks via a `source` indicator

***

### Story 5: Add Tasks During Recording [#story-5-add-tasks-during-recording]

> As a **WordLoop user**, I want to **manually add my own tasks while a meeting is being recorded** so that **I can capture action items the AI might miss**.

**Acceptance Criteria:**

* Given a recording is active, a task input field is available in the meeting view
* When I submit a task, it appears immediately in the task list (optimistic update)
* User-created tasks are tagged with `source: user` to distinguish them from system-extracted tasks

***

### Story 6: Stop Recording and Generate Summary [#story-6-stop-recording-and-generate-summary]

> As a **WordLoop user**, I want to **stop the recording and receive a complete meeting summary** so that **I have a structured artefact of the meeting**.

**Acceptance Criteria:**

* Given a recording is active, when I press "Stop", the system ends the recording session
* The system **automatically** generates a headline for the meeting
* The system **automatically** generates a summary and topics (with best-effort talking-point nesting)
* The UI transitions from the live recording view to the standard meeting detail view
* All generated content is visible when the meeting detail loads

***

### Story 7: Automatic Post-Meeting Re-Generation [#story-7-automatic-post-meeting-re-generation]

> As the **WordLoop system**, I want to **automatically re-process the meeting after recording ends** so that **the transcription and synthesis are as accurate as possible**.

**Acceptance Criteria:**

* Given a recording has stopped and the audio has been stored, the system automatically triggers a full offline re-processing job
* The system re-transcribes the full audio using the offline pipeline for higher accuracy
* The system re-generates talking points, topics, summary, and headline from the improved transcript
* Talking points and topics are promoted from **draft** to **final**
* **Tasks are NOT re-generated** — both user-created and system-extracted tasks from the live session are preserved
* The UI updates in real time as each re-generated artefact completes

***

### Story 8: Play Back Meeting Audio [#story-8-play-back-meeting-audio]

> As a **WordLoop user**, I want to **play back the recorded audio after a meeting ends** so that **I can revisit specific moments I may have missed**.

**Acceptance Criteria:**

* Given a meeting has a completed recording, an audio player is available on the meeting detail view
* The player supports play, pause, seek (scrubbing), and playback speed control (0.5×, 1×, 1.5×, 2×)
* Audio is streamed — the full file does not need to download before playback begins
* The player displays the current playback position and total duration

***

### Story 9: Synchronised Transcript Highlighting [#story-9-synchronised-transcript-highlighting]

> As a **WordLoop user**, I want to **see the transcript highlight in sync with audio playback** so that **I can follow along with what's being said**.

**Acceptance Criteria:**

* Given audio is playing, the transcript segment matching the current playback time is visually highlighted
* The transcript auto-scrolls to keep the highlighted segment visible
* When I click a transcript segment, the audio player seeks to that segment's start time and begins playing

***

### Story 10: Live Speaker Identification [#story-10-live-speaker-identification]

> As a **WordLoop user**, I want to **see who is speaking during a live recording** so that **the transcript is attributed to the correct person, not just "Speaker A"**.

**Acceptance Criteria:**

* Given a recording is active and speaker voice profiles have been enrolled, the system matches speaker voice embeddings against known person profiles in near real-time
* When a match exceeds the confidence threshold, the transcript segment shows the resolved person's name instead of the raw speaker label
* Unmatched speakers continue to display their raw speaker label (e.g., "Speaker A")
* Speaker resolution happens incrementally — early segments may remain unresolved and get updated as more audio is processed

***

### Story 11: Graceful Degradation During Recording [#story-11-graceful-degradation-during-recording]

> As a **WordLoop user**, I want to **keep recording even if AI services are temporarily unavailable** so that **I don't lose the meeting**.

**Acceptance Criteria:**

* Given a recording is active and the ML service becomes unavailable, the system continues capturing and storing audio
* The UI displays a clear message indicating that audio is being captured and transcription/insights will be generated when services recover
* Recovery is fully automatic — no user intervention required

***

### Story 12: Recording Duration Limit [#story-12-recording-duration-limit]

> As a **WordLoop user**, I want to **be warned when I'm approaching the recording time limit** so that **I can wrap up the meeting before it's cut off**.

**Acceptance Criteria:**

* Given a recording has been active for a configurable duration (default: 4 hours), the system automatically stops the recording
* A warning is shown to the user at a configurable interval before the limit (e.g., 10 minutes before)
* When the recording is auto-stopped, the standard post-meeting generation and reprocessing pipeline runs automatically

***

## Screen Inventory [#screen-inventory]

| Screen                              | Status                           | Data needed                                                                                                                                       | Actions                                      |
| ----------------------------------- | -------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------- |
| Meeting Detail — Recording Controls | Extend existing                  | `meeting.source_type`, `meeting_audio_files.status`                                                                                               | Press Record, Press Stop                     |
| Live Recording View                 | Extend Meeting Detail            | `TranscriptSegmentEvent` (WS), `TalkingPointEvent` (WS), `EntityChanged { entity: task }` (WS), `RecordingStartedEvent`, `RecordingDegradedEvent` | Add task (optimistic), Stop recording        |
| Meeting Detail — Playback           | Extend existing (post-recording) | `GET /meetings/{id}/audio-url`, transcript segments with `start_ms`/`end_ms`, `person_id`                                                         | Play/pause/seek/speed, click segment to seek |

***

## Information Architecture [#information-architecture]

### Live Recording View [#live-recording-view]

```
[Meeting Detail — Recording Active]
  ├── Recording Indicator (pulsing, top banner)
  │     elapsed time | Degraded mode warning (conditional)
  ├── Main Area (two columns)
  │     Left: Live Transcript (auto-scroll)
  │             [interim segment — visually muted]
  │             [final segment — speaker label or person name]
  │     Right: Live Sidebar
  │             ├── Talking Points (draft badge)
  │             │     scrollable list, latest at top
  │             └── Tasks
  │                   task input field
  │                   scrollable list (user vs system indicator)
  └── [Stop Recording] button
```

### Meeting Playback View [#meeting-playback-view]

```
[Meeting Detail — After Recording]
  ├── Audio Player (fixed header)
  │     ◀ 15s | ▶ Play | ▶▶ 15s | 01:23 / 42:17 | speed | scrub bar
  ├── Main Area (two columns)
  │     Left: Transcript
  │             [segment — highlighted when current, click to seek]
  │             speaker label or resolved person name per segment
  │     Right: Post-Meeting Sidebar
  │             ├── Talking Points (draft → final badge)
  │             ├── Topics
  │             └── Tasks
```


# Audio (/docs/work/meeting-recording/tdd/contracts/audio)


# Audio [#audio]

Audio chunks flow from the browser through Core to ML as binary WebSocket frames. This page covers the frame formats for both hops, chunk-based GCS storage, ML acknowledgement, and backpressure signalling. For the recording lifecycle (start/stop/resume commands and events), see [Recording](recording). For shared connection semantics, see [Infrastructure](infrastructure).

## Browser → Core: Binary Audio Frame [#browser--core-binary-audio-frame]

Audio chunks are sent as binary WebSocket frames using a length-prefixed metadata envelope followed by raw audio bytes.

```text
uint32_be metadata_length
utf8_json metadata
raw_audio_bytes
```

Metadata schema:

```json
{
  "type": "com.wordloop.recording.audio_chunk.v1",
  "id": "chunk-event-uuid",
  "traceparent": "00-...",
  "meeting_id": "meeting-uuid",
  "sequence": 1842,
  "started_at_ms": 184200,
  "duration_ms": 100,
  "mime_type": "audio/webm",
  "crc32": "hex-encoded-crc32"
}
```
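
The length-prefixed envelope above can be packed and unpacked with the standard library. This is a minimal sketch written in Python for illustration (the browser side would do the equivalent with a `DataView`); `encode_frame` and `decode_frame` are hypothetical names.

```python
import json
import struct
from typing import Tuple


def encode_frame(metadata: dict, audio: bytes) -> bytes:
    """Build uint32_be metadata_length + UTF-8 JSON metadata + raw audio bytes."""
    meta = json.dumps(metadata, separators=(",", ":")).encode("utf-8")
    return struct.pack(">I", len(meta)) + meta + audio


def decode_frame(frame: bytes) -> Tuple[dict, bytes]:
    """Split a binary frame into (metadata, raw_audio_bytes)."""
    if len(frame) < 4:
        raise ValueError("frame shorter than the 4-byte length prefix")
    (meta_len,) = struct.unpack_from(">I", frame, 0)
    end = 4 + meta_len
    if end > len(frame):
        raise ValueError("metadata_length exceeds frame size")
    metadata = json.loads(frame[4:end].decode("utf-8"))
    return metadata, frame[end:]
```

The length prefix counts bytes of the encoded JSON, not characters, which matters as soon as the metadata contains non-ASCII text.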

Core verifies the CRC32 checksum, stores the chunk by sequence number in GCS, enriches the metadata with `ml_session_id`, forwards the frame to ML over the ML WebSocket, and records the highest contiguous sequence. Duplicate sequences are acknowledged but not re-stored.
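
The checksum, deduplication, and contiguous-sequence bookkeeping can be sketched as follows. Core is written in Go, so this Python version only illustrates the logic. It rests on several assumptions that this page does not state: the `crc32` field covers the raw audio bytes only, it is encoded as eight zero-padded hex digits of the standard CRC-32 (the variant `zlib.crc32` computes), and sequences start at 1. The `ChunkIngest` class and its in-memory `stored` dict are hypothetical stand-ins for the GCS write path.

```python
import zlib


def crc32_hex(audio: bytes) -> str:
    """Hex-encode the CRC-32 of the audio bytes (assumed encoding: 8 lowercase digits)."""
    return format(zlib.crc32(audio) & 0xFFFFFFFF, "08x")


class ChunkIngest:
    def __init__(self) -> None:
        self.stored = {}             # sequence -> audio bytes; stand-in for GCS objects
        self.highest_contiguous = 0  # assumes the first chunk has sequence 1

    def accept(self, sequence: int, crc32: str, audio: bytes) -> str:
        """Return 'rejected', 'duplicate', or 'stored' for one incoming chunk."""
        if crc32_hex(audio) != crc32.lower():
            return "rejected"
        if sequence in self.stored:
            return "duplicate"  # acknowledged, but not stored again
        self.stored[sequence] = audio
        # Advance past every sequence that is now present without a gap.
        while self.highest_contiguous + 1 in self.stored:
            self.highest_contiguous += 1
        return "stored"
```

Any stored sequence above `highest_contiguous` implies a gap below it, which is what gap recovery would need to backfill.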

## Chunk-Based GCS Storage [#chunk-based-gcs-storage]

Each audio chunk is stored as a separate GCS object keyed by sequence number: `meetings/{id}/chunks/{seq:08d}.webm`. WebM encodes its EBML header in the first chunk; subsequent chunks contain raw Cluster data. This structure enables gap recovery — any chunk missed due to a connectivity failure can be backfilled from OPFS by sequence number. At session end, Core composes the chunk objects into the final `audio.webm` using GCS Compose — hierarchically in groups of ≤32 for recordings that exceed GCS's 32-object compose limit.
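
A minimal sketch of the object key format and of planning the hierarchical compose. It only plans the operations and makes no GCS calls; the intermediate object names (`.part-<level>-<n>`) and the function names are hypothetical.

```python
from typing import List, Tuple

ComposeOp = Tuple[List[str], str]  # (source object names, destination object name)


def chunk_key(meeting_id: str, sequence: int) -> str:
    """Object key for one chunk, zero-padded so keys sort in sequence order."""
    return f"meetings/{meeting_id}/chunks/{sequence:08d}.webm"


def plan_compose(sources: List[str], final_name: str, limit: int = 32) -> List[List[ComposeOp]]:
    """Plan compose rounds so that no single operation has more than `limit` sources."""
    if not sources:
        raise ValueError("nothing to compose")
    rounds: List[List[ComposeOp]] = []
    current = list(sources)
    level = 0
    while True:
        # Consecutive slices keep the audio in its original order.
        groups = [current[i:i + limit] for i in range(0, len(current), limit)]
        if len(groups) == 1:
            rounds.append([(groups[0], final_name)])
            return rounds
        step = [(g, f"{final_name}.part-{level}-{n:04d}") for n, g in enumerate(groups)]
        rounds.append(step)
        current = [dest for _, dest in step]
        level += 1
```

If chunks are 100 ms long, as in the metadata example, a one-hour recording produces 36,000 chunks and this plan needs four rounds. Intermediate objects would need to be deleted after the final compose.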

## OPFS Shadow Buffer [#opfs-shadow-buffer]

Every audio chunk is simultaneously written to an always-on shadow buffer maintained by a dedicated Web Worker using the Origin Private File System (OPFS) `createSyncAccessHandle()` API. Each chunk carries a monotonically incrementing sequence number assigned in the browser. This buffer runs unconditionally — it captures audio regardless of Core or GCS connectivity. It is cleared only after Core confirms all chunks are safely in GCS.

### OPFS Chunk Storage Format [#opfs-chunk-storage-format]

Each chunk is stored in OPFS with an integrity envelope so corrupted chunks can be detected during gap recovery:

```text
uint32_be crc32
uint32_be audio_length
raw_audio_bytes
```

The CRC32 is computed over the raw audio bytes. On read (during gap recovery), the reader verifies the CRC32 before uploading. Chunks that fail verification are skipped — the post-meeting batch transcription will handle any resulting audio gaps.
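
The integrity envelope and the verify-on-read rule can be sketched as follows. This is a Python illustration of the byte layout only (the real writer runs in a Web Worker); it assumes the standard CRC-32 that `zlib.crc32` computes, and `pack_chunk` and `unpack_chunk` are hypothetical names.

```python
import struct
import zlib
from typing import Optional

HEADER = struct.Struct(">II")  # uint32_be crc32, uint32_be audio_length


def pack_chunk(audio: bytes) -> bytes:
    """Prefix the raw audio with its CRC-32 and length."""
    return HEADER.pack(zlib.crc32(audio) & 0xFFFFFFFF, len(audio)) + audio


def unpack_chunk(record: bytes) -> Optional[bytes]:
    """Return the audio bytes, or None if the record is truncated or corrupt."""
    if len(record) < HEADER.size:
        return None
    crc, length = HEADER.unpack_from(record, 0)
    audio = record[HEADER.size:HEADER.size + length]
    if len(audio) != length or (zlib.crc32(audio) & 0xFFFFFFFF) != crc:
        return None  # caller skips this chunk during gap recovery
    return audio
```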

## Core → ML: Binary Audio Frame [#core--ml-binary-audio-frame]

Core enriches the browser's binary frame with `ml_session_id` before forwarding to ML. The binary framing structure is identical (length-prefixed metadata + raw audio), but the metadata schema differs from the browser→Core frame.

```text
uint32_be metadata_length
utf8_json metadata
raw_audio_bytes
```

Metadata schema:

```json
{
  "type": "com.wordloop.ml.audio_chunk.v1",
  "id": "chunk-event-uuid",
  "traceparent": "00-...",
  "meeting_id": "meeting-uuid",
  "ml_session_id": "ml-session-uuid",
  "sequence": 1842,
  "started_at_ms": 184200,
  "duration_ms": 100,
  "mime_type": "audio/webm",
  "crc32": "hex-encoded-crc32"
}
```

ML acknowledges processed audio progress through `AudioChunkAckEvent`, not per-frame WebSocket acks. This avoids chatty acknowledgements while still letting Core detect lag.

## ML → Core: `AudioChunkAckEvent` [#ml--core-audiochunkackevent]

Reports processed audio progress. Core uses this for diagnostics and backpressure decisions.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.audio_chunk.ack.v1",
  "time": "2026-05-01T09:03:05Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "last_sequence_received": 1842,
    "last_sequence_processed": 1841
  }
}
```
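
A minimal sketch of how a consumer might turn this event into a lag figure. The `max_lag` threshold is hypothetical; this page does not specify how Core weighs the lag in its backpressure decisions.

```python
import json

ACK_TYPE = "com.wordloop.ml.audio_chunk.ack.v1"


def ack_lag(event_json: str) -> int:
    """Return how many received chunks ML has not yet processed."""
    event = json.loads(event_json)
    if event.get("type") != ACK_TYPE:
        raise ValueError(f"unexpected event type: {event.get('type')!r}")
    data = event["data"]
    return data["last_sequence_received"] - data["last_sequence_processed"]


def is_lagging(event_json: str, max_lag: int = 50) -> bool:
    # max_lag is a hypothetical threshold, not a documented default.
    return ack_lag(event_json) > max_lag
```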

***

## Backpressure [#backpressure]

### ML → Core: `BackpressureEvent` [#ml--core-backpressureevent]

Tells Core that ML is falling behind. Core continues storing audio to GCS and may degrade live insights while preserving the recording.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.backpressure.v1",
  "time": "2026-05-01T09:05:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "reason": "provider_latency",
    "retry_after_ms": 1000,
    "queue_depth": 128
  }
}
```

### ML → Core: `BackpressureClearedEvent` — **New** [#ml--core-backpressureclearedevent--new]

Explicitly signals that ML has recovered from backpressure. Without this, Core must infer recovery from the absence of further `BackpressureEvent` messages or from `AudioChunkAckEvent` progress, which makes Core's state machine ambiguous.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.backpressure_cleared.v1",
  "time": "2026-05-01T09:05:30Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "queue_depth": 0
  }
}
```
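
With an explicit cleared event, the receiving side reduces to a two-state tracker per meeting. This is a minimal sketch under that reading; the `BackpressureTracker` class is hypothetical, and Core itself is written in Go.

```python
BACKPRESSURE = "com.wordloop.ml.backpressure.v1"
CLEARED = "com.wordloop.ml.backpressure_cleared.v1"


class BackpressureTracker:
    def __init__(self) -> None:
        self.degraded = {}  # meeting_id -> reason reported by ML

    def handle(self, event: dict) -> bool:
        """Apply one ML event; return True if the meeting is degraded afterwards."""
        data = event.get("data", {})
        meeting_id = data.get("meeting_id")
        if event.get("type") == BACKPRESSURE:
            self.degraded[meeting_id] = data.get("reason", "unknown")
        elif event.get("type") == CLEARED:
            self.degraded.pop(meeting_id, None)
        # Any other event type leaves the state unchanged.
        return meeting_id in self.degraded
```

Audio storage is unaffected in either state; only live insights would be gated on the returned flag.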

***

## Client-Side Backpressure [#client-side-backpressure]

Core does not send an explicit backpressure event to the browser. Instead, the client monitors `WebSocket.bufferedAmount` on the Core-facing connection. If `bufferedAmount` exceeds a configurable threshold (default: 5 MB), the client stops sending `MediaRecorder` output over the socket and keeps writing chunks to the OPFS shadow buffer only. When `bufferedAmount` drops below the resume threshold (default: 1 MB), the client resumes sending. This uses the browser's native WebSocket flow control rather than adding a custom protocol-level backpressure mechanism.
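
The two thresholds form a hysteresis band, which keeps the client from flapping between sending and pausing when the buffer hovers near a single limit. A minimal sketch of the decision logic, written in Python for illustration (the real check would read `WebSocket.bufferedAmount` in the browser); the `SendGate` name is hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
MIB = 1024 * 1024  # assumption: the documented "MB" defaults are treated as MiB


class SendGate:
    def __init__(self, pause_above: int = 5 * MIB, resume_below: int = 1 * MIB) -> None:
        if resume_below > pause_above:
            raise ValueError("resume_below must not exceed pause_above")
        self.pause_above = pause_above
        self.resume_below = resume_below
        self.paused = False

    def should_send(self, buffered_amount: int) -> bool:
        """Update the gate from the current buffered byte count; True means send."""
        if self.paused:
            if buffered_amount < self.resume_below:
                self.paused = False
        elif buffered_amount > self.pause_above:
            self.paused = True
        return not self.paused
```

Between the two thresholds the gate keeps its previous state, so a buffer draining from 6 MB to 3 MB stays paused while one filling from 0 to 3 MB keeps sending.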


# Contracts (/docs/work/meeting-recording/tdd/contracts)


# Contracts [#contracts]

These contracts define the complete API surface for the Meeting Recording bet. Each page covers one entity end-to-end: its resource shape, REST operations, WebSocket events, ML integration, and Pub/Sub triggers — so you can understand how a single concept works across all protocols without jumping between files.

For shared concerns that apply across all entities — connection semantics, authentication, error format, CloudEvents envelope, Pub/Sub configuration, and failure modes — see [Infrastructure](./infrastructure).

**Specification alignment:** existing machine-readable specs live in `specs/core-openapi.json`, `specs/ml-openapi.json`, `specs/core-asyncapi-ws.yaml`, and `specs/core-asyncapi-pubsub.yaml`. These contract pages describe the ideal-state surface — both existing endpoints and new additions for live recording. New endpoints and fields introduced by this bet are marked with **New**.

***

## Entity Pages [#entity-pages]

| Entity                           | What it covers                                                                                                  |
| -------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| [Meeting](./meeting)             | Top-level resource. CRUD, expand parameter, speaker-label assignment, audio playback URL.                       |
| [Recording](./recording)         | Live recording lifecycle. WebSocket commands and events, ML session management, OPFS gap repair, Pub/Sub drain. |
| [Audio](./audio)                 | Binary audio transport. Frame formats (browser→Core, Core→ML), chunk storage, acknowledgement, backpressure.    |
| [Transcription](./transcription) | Transcript segments. CRUD, live streaming events, batch processing, ML write-back, Pub/Sub trigger.             |
| [Synthesis](./synthesis)         | ML-generated artefacts. Summary, topics, talking points. Read endpoints and ML write-back.                      |
| [Task](./task)                   | Action items. Full CRUD with sub-task nesting. User-created and system-generated (ML).                          |
| [Person](./person)               | Speaker identity. CRUD, speaker identification pipeline, voice profiles, ML matching events.                    |

***

## ML → Core Write-Back Summary [#ml--core-write-back-summary]

ML writes durable meeting artefacts to Core REST, not to the browser and not directly to Core's database. Each write-back endpoint is documented on its entity page.

| ML output                    | Core REST target                     | Entity page                      |
| ---------------------------- | ------------------------------------ | -------------------------------- |
| Headline                     | `PATCH /meetings/{id}`               | [Meeting](./meeting)             |
| Live transcript append       | `POST /transcriptions/{id}/segments` | [Transcription](./transcription) |
| Batch transcript replacement | `PUT /transcriptions/{id}/segments`  | [Transcription](./transcription) |
| Transcription lifecycle      | `PATCH /transcriptions/{id}/status`  | [Transcription](./transcription) |
| Live talking point           | `POST /meetings/{id}/talking-points` | [Synthesis](./synthesis)         |
| Final synthesis              | `PUT /meetings/{id}/synthesis`       | [Synthesis](./synthesis)         |
| System-generated tasks       | `POST /tasks`                        | [Task](./task)                   |

***

## Open Problem: ML→Core Write-Back Resilience [#open-problem-mlcore-write-back-resilience]

If Core is unavailable when ML finishes post-meeting processing, ML cannot deliver results (transcript segments, synthesis, status updates). Today, ML retries with exponential backoff against Core REST. If Core remains down beyond the retry budget, the results are lost.

This is documented as a separate [problem statement](/docs/work/problem-statements/ml-writeback-resilience) for a future bet. Potential approaches include ML publishing results to Pub/Sub as a durable fallback, or persisting results to its own store for later delivery.
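
The current behaviour (exponential backoff bounded by a retry budget) can be sketched as a delay schedule. All numbers here are hypothetical, since this page does not state ML's base delay, cap, or budget, and jitter is omitted to keep the sketch deterministic.

```python
from typing import List


def backoff_schedule(
    base_s: float = 1.0, factor: float = 2.0, cap_s: float = 60.0, budget_s: float = 300.0
) -> List[float]:
    """Return the waits between attempts that fit inside the retry budget."""
    if base_s <= 0 or factor < 1 or cap_s <= 0:
        raise ValueError("base_s and cap_s must be positive and factor >= 1")
    delays: List[float] = []
    total = 0.0
    delay = min(base_s, cap_s)
    while total + delay <= budget_s:
        delays.append(delay)
        total += delay
        delay = min(delay * factor, cap_s)  # grow exponentially up to the cap
    return delays
```

Once the schedule is exhausted the caller has nowhere durable to put the results, which is the gap the problem statement describes.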


# Infrastructure (/docs/work/meeting-recording/tdd/contracts/infrastructure)


# Infrastructure [#infrastructure]

Shared semantics that apply across all entity contracts. Each entity page ([Meeting](meeting), [Recording](recording), [Audio](audio), [Transcription](transcription), [Synthesis](synthesis), [Task](task), [Person](person)) documents its own endpoints and events but relies on the conventions defined here.

***

## Core REST Semantics [#core-rest-semantics]

| Concern          | Contract                                                                                                                                                                                                                                                                                                                                              |
| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Auth             | User-facing calls require `bearerAuth` (Clerk JWT). ML write-back uses service-to-service auth (signed JWT or mTLS).                                                                                                                                                                                                                                  |
| User scoping     | All resources are implicitly scoped to the authenticated user's `sub` claim. Queries return only that user's data; mutations on another user's resource return `403`. `user_id` never appears in request or response bodies — it is derived from the token. Service-auth calls include `user_id` in the request body when acting on behalf of a user. |
| Trace context    | All requests accept `traceparent` and `tracestate` headers. Core propagates trace context into WebSocket and Pub/Sub envelopes.                                                                                                                                                                                                                       |
| Idempotency      | All `POST` requests require `Idempotency-Key: <uuid>`. Retried requests return the original result with the same status code.                                                                                                                                                                                                                         |
| Echo suppression | User mutations accept `Client-Session-Id: <uuid>`. WebSocket echoes caused by that client carry `sourceClientId` so the origin tab can discard them.                                                                                                                                                                                                  |
| Errors           | All errors return `application/problem+json` with RFC 9457 fields: `type`, `title`, `status`, `detail`, `instance`, and optional field-level `errors[]`.                                                                                                                                                                                              |
| Pagination       | Cursor-based. Requests accept `cursor` and `limit` (default 20, max 100). Responses include `next_cursor`.                                                                                                                                                                                                                                            |
| Location header  | All `201 Created` responses include a `Location` header pointing to the new resource.                                                                                                                                                                                                                                                                 |
| Rate limits      | User-facing responses include `RateLimit-Limit`, `RateLimit-Remaining`, and `RateLimit-Reset`.                                                                                                                                                                                                                                                        |
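
A minimal sketch of a client walking a cursor-paginated collection under these rules. The `fetch_page` callable is a hypothetical stand-in for the HTTP call; the `items` field name and the convention that `next_cursor` is absent or null on the last page are assumptions, not part of the contract above.

```python
from typing import Callable, Iterator, Optional

DEFAULT_LIMIT = 20
MAX_LIMIT = 100


def iter_items(
    fetch_page: Callable[[Optional[str], int], dict], limit: int = DEFAULT_LIMIT
) -> Iterator[dict]:
    """Yield every item, following next_cursor until the server stops returning one."""
    limit = max(1, min(limit, MAX_LIMIT))  # clamp to the documented maximum
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor, limit)
        yield from page.get("items", [])  # "items" is an assumed field name
        cursor = page.get("next_cursor")
        if not cursor:
            return
```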

***

## ML REST Semantics [#ml-rest-semantics]

| Concern            | Contract                                                                                                                                    |
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------- |
| Auth               | Service-to-service auth only. Core is the normal caller. Browser credentials are never accepted.                                            |
| Trace context      | `traceparent` and `tracestate` accepted on every request and copied into downstream provider calls (AssemblyAI, OpenAI).                    |
| Idempotency        | Creating or draining sessions requires `Idempotency-Key: <uuid>`.                                                                           |
| Errors             | `application/problem+json` with RFC 9457 fields. Validation errors include field-level `errors[]`.                                          |
| Timeouts           | Session create: 20 seconds. Drain: 30 seconds before returning `202 Accepted`. Voice operations: 30 seconds.                                |
| PII/audio handling | Raw audio is not persisted by ML unless explicitly part of a voice-profile enrichment operation. Live audio durability belongs to Core/GCS. |
| Location header    | All `201 Created` responses include a `Location` header.                                                                                    |

***

## Browser WebSocket Connection [#browser-websocket-connection]

Core owns the only browser-facing WebSocket.

| Concern         | Contract                                                                                                                                                                                                                                                      |
| --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Endpoint        | `GET /ws` upgrade                                                                                                                                                                                                                                             |
| Auth            | `token=<jwt>` query parameter, or `Authorization: Bearer <jwt>` when the edge supports forwarding headers                                                                                                                                                     |
| Client identity | `client_session_id=<uuid>` query parameter; copied into `sourceClientId` on echoes caused by that client                                                                                                                                                      |
| Replay cursor   | `last_event_id=<uuid>` optional. Core replays durable entity-change events after this cursor when the replay buffer has not expired.                                                                                                                          |
| Replay buffer   | Core retains the last **5 minutes** of durable events per user. If the client reconnects after the buffer has expired, it must do a full state re-fetch via REST. Core signals this by sending a `ReplayExpiredEvent` instead of replaying events.            |
| Message size    | JSON text frames: 64 KiB. Binary audio frames: 1 MiB.                                                                                                                                                                                                         |
| Keepalive       | Native WebSocket ping every 30 seconds; two missed pongs terminate the connection.                                                                                                                                                                            |
| Load balancing  | Live recording requires affinity to one Core pod for the life of the socket. The ideal edge is Layer 4 or an equivalent connection-stable route. This is a known constraint — see the [backplane problem statement](/docs/work/problem-statements/backplane). |
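
As a rough illustration of the connection contract above, the following Python sketch builds the `/ws` upgrade URL from the documented query parameters and checks a frame against the documented size limits. The function names, the example host, and the choice to treat the limits as inclusive are assumptions made for illustration; they are not part of the contract.

```python
from typing import Optional
from urllib.parse import urlencode

# Frame limits taken from the contract table above.
MAX_TEXT_FRAME_BYTES = 64 * 1024
MAX_BINARY_FRAME_BYTES = 1024 * 1024


def build_ws_url(base_url: str, token: str, client_session_id: str,
                 last_event_id: Optional[str] = None) -> str:
    """Build the /ws upgrade URL with the documented query parameters."""
    params = {"token": token, "client_session_id": client_session_id}
    if last_event_id is not None:
        # Only sent on reconnect, to request replay after this cursor.
        params["last_event_id"] = last_event_id
    return f"{base_url.rstrip('/')}/ws?{urlencode(params)}"


def frame_fits(payload: bytes, binary: bool) -> bool:
    """Check a frame against the limit for its kind (inclusive: an assumption)."""
    limit = MAX_BINARY_FRAME_BYTES if binary else MAX_TEXT_FRAME_BYTES
    return len(payload) <= limit
```

A first connection omits `last_event_id`; a reconnect passes the `id` of the last event the client applied.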

### Real-Time Pattern [#real-time-pattern]

Entity mutations use **Optimistic Mutation + Echo-Suppressed Streaming**. Audio uses **bidirectional recording streaming**. REST remains the source of truth for writes; WebSocket events keep every open tab in sync and deliver low-latency recording artefacts.

### CloudEvents Envelope [#cloudevents-envelope]

Every text frame is a CloudEvents v1.0 structured JSON event.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.entity.changed.v1",
  "time": "2026-05-01T09:00:00Z",
  "traceparent": "00-...",
  "tracestate": "vendor=value",
  "sourceClientId": "client-session-uuid",
  "data": {}
}
```

`sourceClientId` is present only when the event was caused by a specific UI session. The origin client discards matching echoes; other tabs and devices apply the event.
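
The echo-suppression rule can be sketched as a small client-side filter. This is a minimal Python illustration, assuming the client already knows its own `client_session_id`; the function names are hypothetical, and a real client would dispatch on `type` after this check.

```python
import json
from typing import Optional


def should_apply(event: dict, client_session_id: str) -> bool:
    """Return False for echoes caused by this client, True otherwise."""
    # sourceClientId is absent when no specific UI session caused the event,
    # in which case every client applies it.
    return event.get("sourceClientId") != client_session_id


def handle_text_frame(frame: str, client_session_id: str) -> Optional[dict]:
    """Parse one text frame and return the event only if it should be applied."""
    event = json.loads(frame)
    if event.get("specversion") != "1.0":
        raise ValueError("not a CloudEvents v1.0 frame")
    return event if should_apply(event, client_session_id) else None
```

The origin tab has already applied its optimistic update, so dropping the matching echo avoids a redundant re-render, while other tabs and devices still receive the change.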

***

## ML WebSocket Connection [#ml-websocket-connection]

Core opens one WebSocket per ML live session after `POST /meetings/{id}/live-session` returns `websocket_url`. The browser never connects to ML directly — Core bridges the ML WebSocket to the browser WebSocket.

| Concern       | Contract                                                                                                                                                    |
| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Endpoint      | `GET /meetings/{meeting_id}/live-session/stream` upgrade                                                                                                    |
| Caller        | Core only                                                                                                                                                   |
| Auth          | Service bearer token or mTLS identity. Browser credentials are never accepted.                                                                              |
| Trace context | Initial handshake includes `traceparent`; every CloudEvent also carries `traceparent`.                                                                      |
| Replay cursor | Core may reconnect with `last_ml_event_id` and `last_audio_sequence`. ML de-duplicates audio by sequence and resumes output after the cursor when possible. |
| Keepalive     | Native WebSocket ping every 30 seconds. Either side may close with code `1012` for service restart.                                                         |

### Text Frame Envelope [#text-frame-envelope]

All text frames are CloudEvents v1.0 structured JSON.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.transcript.segment.v1",
  "time": "2026-05-01T09:03:05Z",
  "traceparent": "00-...",
  "data": {}
}
```

***

## Cache Invalidation [#cache-invalidation]

### `EntityChangedEvent` [#entitychangedevent]

Generic cache-invalidation signal for single-entity mutations. For bulk operations (transcript replacement, synthesis update), entity pages define specific event types.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.entity.changed.v1",
  "time": "2026-05-01T09:05:00Z",
  "traceparent": "00-...",
  "sourceClientId": "client-session-uuid",
  "data": {
    "entity": "meeting",
    "action": "updated",
    "id": "meeting-uuid",
    "version": 42
  }
}
```

Valid entities: `meeting`, `person`, `task`, `note`, `transcription`, `transcript_segment`, `talking_point`, `synthesis`, `speaker_state`.

Valid actions: `created`, `updated`, `deleted`.

### `ReplayExpiredEvent` — **New** [#replayexpiredevent--new]

Sent when the client reconnects with a `last_event_id` that is older than the replay buffer (5 minutes). The client must do a full state re-fetch via REST.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.replay.expired.v1",
  "time": "2026-05-01T09:35:00Z",
  "traceparent": "00-...",
  "data": {
    "last_event_id": "stale-event-uuid",
    "buffer_ttl_seconds": 300,
    "message": "Replay buffer expired. Full state re-fetch required."
  }
}
```
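
One way to picture the replay decision is a time-bounded buffer that either returns the events after the cursor or reports that the cursor has aged out. The Python sketch below is an in-memory illustration only; the `ReplayBuffer` class, its method names, and the use of `None` to signal expiry are assumptions, and Core's actual storage is not described here.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

BUFFER_TTL_SECONDS = 300  # 5-minute replay buffer from the contract


class ReplayBuffer:
    """Per-user buffer of durable events, ordered by arrival time."""

    def __init__(self) -> None:
        self._events: Deque[Tuple[float, dict]] = deque()

    def append(self, event: dict, now: float) -> None:
        self._events.append((now, event))
        self._evict(now)

    def _evict(self, now: float) -> None:
        # Drop events older than the TTL from the front of the buffer.
        while self._events and now - self._events[0][0] > BUFFER_TTL_SECONDS:
            self._events.popleft()

    def replay_after(self, last_event_id: str, now: float) -> Optional[List[dict]]:
        """Events after the cursor, or None when the cursor is no longer buffered."""
        self._evict(now)
        events = [event for _, event in self._events]
        ids = [event["id"] for event in events]
        if last_event_id not in ids:
            return None  # caller sends ReplayExpiredEvent instead of replaying
        return events[ids.index(last_event_id) + 1:]
```

A `None` result corresponds to the `ReplayExpiredEvent` case; an empty list means the client is already up to date.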

***

## Browser Reconnection Rules [#browser-reconnection-rules]

| Scenario                                | Contract                                                                                                                                                                         |
| --------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Browser loses socket (\< 5 min)         | App reconnects with `last_event_id`. Core replays buffered events. If a recording is active, app also sends `ResumeRecordingCommand`.                                            |
| Browser loses socket (> 5 min)          | Core sends `ReplayExpiredEvent`. App does a full REST re-fetch. If a recording is active, app sends `ResumeRecordingCommand` for gap recovery.                                   |
| Audio frames duplicated after reconnect | Core de-duplicates by `(meeting_id, sequence)` and checksum.                                                                                                                     |
| ML stream drops but Core socket remains | Core emits `RecordingErrorEvent { code: "ml_unavailable", severity: "degraded" }`; audio still writes to GCS. On recovery, emits `RecordingErrorEvent { code: "ml_recovered" }`. |
| Core drains for deploy                  | Core sends `RecordingErrorEvent { code: "backpressure" }` or closes after ping timeout; OPFS gap repair restores missing chunks on reconnect.                                    |
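
The de-duplication rule for re-sent audio frames can be sketched as a lookup keyed by `(meeting_id, sequence)` with a checksum comparison. This Python example is illustrative only: the SHA-256 checksum, the `ChunkDeduplicator` name, and the `"conflict"` outcome for a same-sequence frame with different bytes are assumptions, since the contract does not specify them.

```python
import hashlib
from typing import Dict, Tuple


class ChunkDeduplicator:
    """Tracks which audio chunks have already been accepted."""

    def __init__(self) -> None:
        self._seen: Dict[Tuple[str, int], str] = {}

    def accept(self, meeting_id: str, sequence: int, payload: bytes) -> str:
        key = (meeting_id, sequence)
        digest = hashlib.sha256(payload).hexdigest()  # checksum choice is assumed
        previous = self._seen.get(key)
        if previous is None:
            self._seen[key] = digest
            return "stored"
        if previous == digest:
            return "duplicate"  # safe to drop: same chunk re-sent after reconnect
        return "conflict"  # same sequence, different bytes: needs investigation
```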

***

## Pub/Sub Semantics [#pubsub-semantics]

Pub/Sub is for durable asynchronous work — not the live path. Live audio and ML outputs use WebSockets. Pub/Sub coordinates post-meeting processing, session termination/drain, and retryable background jobs. Individual topics are documented on their entity pages ([Transcription](transcription), [Recording](recording)).

All Pub/Sub payloads are CloudEvents v1.0 JSON.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/pubsub",
  "type": "com.wordloop.transcription.requested.v1",
  "time": "2026-05-01T10:00:00Z",
  "traceparent": "00-...",
  "tracestate": "vendor=value",
  "data": {}
}
```

| Concern      | Contract                                                                                                                                                                                                                                                 |
| ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Delivery     | At least once. Consumers must de-duplicate by CloudEvents `id` and business idempotency keys.                                                                                                                                                            |
| Ordering key | `meeting_id` for all topics. Ensures events for the same meeting are processed in order within a single subscriber.                                                                                                                                      |
| Publishing   | Core publishes through a transactional outbox — the event is written to an outbox table within the same database transaction as the state change, then delivered by a background relay. This guarantees at-least-once delivery without two-phase commit. |
| Traceability | `traceparent` is required whenever the originating HTTP/WebSocket request carried one.                                                                                                                                                                   |
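
The publishing and delivery rows can be illustrated together. The Python sketch below uses an in-memory SQLite database as a stand-in for Core's Postgres: the state change and the outbox row commit in one transaction, a relay publishes undelivered rows, and a consumer de-duplicates by CloudEvents `id`. The table layout, function names, and the in-memory `seen_ids` set are hypothetical, chosen only to keep the example self-contained.

```python
import json
import sqlite3
from typing import Callable, Set


def open_db() -> sqlite3.Connection:
    """In-memory stand-in for the database; table names are hypothetical."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE meetings (id TEXT PRIMARY KEY, title TEXT)")
    db.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "delivered INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def rename_meeting(db: sqlite3.Connection, meeting_id: str, title: str, event: dict) -> None:
    # One transaction: the state change and the outbox row commit together,
    # or both roll back if either statement fails.
    with db:
        db.execute("UPDATE meetings SET title = ? WHERE id = ?", (title, meeting_id))
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(db: sqlite3.Connection, publish: Callable[[dict], None]) -> int:
    """Publish undelivered rows in order; returns how many were attempted."""
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        # If publish raises, the row stays undelivered and is retried later,
        # which is why consumers can see the same event more than once.
        publish(json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET delivered = 1 WHERE id = ?", (row_id,))
    return len(rows)


def handle_once(seen_ids: Set[str], event: dict, handler: Callable[[dict], None]) -> bool:
    """Consumer-side de-duplication by CloudEvents id."""
    if event["id"] in seen_ids:
        return False
    handler(event)
    seen_ids.add(event["id"])
    return True
```

A crash between `publish` and the `delivered` update results in a second delivery of the same event, which is the at-least-once behaviour that the consumer-side check absorbs.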

### Dead-Letter and Retry Configuration [#dead-letter-and-retry-configuration]

| Setting                  | Value                                   | Rationale                                         |
| ------------------------ | --------------------------------------- | ------------------------------------------------- |
| Max delivery attempts    | 10                                      | Covers transient failures without infinite retry. |
| Initial backoff          | 1 second                                | Fast retry for network blips.                     |
| Max backoff              | 600 seconds (10 min)                    | Caps exponential growth.                          |
| Backoff multiplier       | 2                                       | Standard exponential.                             |
| Dead-letter topic suffix | `-dlq` (e.g., `transcription-jobs-dlq`) | One DLQ per source topic.                         |
| DLQ retention            | 14 days                                 | Enough time for manual investigation and replay.  |
| Ack deadline             | 600 seconds                             | Long enough for batch transcription jobs.         |

When a message exhausts its retry budget, Pub/Sub forwards it to the dead-letter topic. The DLQ subscription has no automatic consumers — an operator (or future automated triage) reviews and replays failed messages.
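
To get a feel for the retry budget, the settings in the table can be turned into a delay schedule. The sketch below assumes pure doubling from the initial backoff with no jitter, and one delay between each pair of consecutive delivery attempts; the actual timing applied by Pub/Sub may differ, so treat the numbers as an approximation.

```python
from typing import List


def backoff_delays(max_attempts: int = 10, initial: float = 1.0,
                   multiplier: float = 2.0, cap: float = 600.0) -> List[float]:
    """Delay in seconds before each redelivery (max_attempts - 1 delays)."""
    return [min(initial * multiplier ** n, cap) for n in range(max_attempts - 1)]


delays = backoff_delays()
total_wait_seconds = sum(delays)
```

Under this pure-doubling assumption the nine delays run from 1 to 256 seconds and sum to 511 seconds, so the 600-second cap is not reached within 10 attempts; it would only apply if the attempt limit were raised.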

***

## ML Stream Health [#ml-stream-health]

### `StreamWarningEvent` [#streamwarningevent]

Reports recoverable ML-side degradation that doesn't rise to backpressure. For audio-specific backpressure events, see [Audio](audio).

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.stream.warning.v1",
  "time": "2026-05-01T09:06:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "code": "insight_warning",
    "message": "Talking points are delayed; transcription continues."
  }
}
```

***

## ML Failure Semantics [#ml-failure-semantics]

| Failure                                    | Contract                                                                                                                                                                                                                                                                                                                                                                                   |
| ------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| ML WebSocket disconnects                   | Core reconnects with `last_audio_sequence` and `last_ml_event_id`. ML de-duplicates audio and resumes output when possible. Core sends `StreamStartEvent` with current speaker states and voice profiles on every reconnect.                                                                                                                                                               |
| ML cannot reconnect                        | Core continues browser audio capture and GCS chunk storage, then emits Core `RecordingErrorEvent { code: "ml_unavailable" }`.                                                                                                                                                                                                                                                              |
| Upstream transcription provider slows down | ML emits `BackpressureEvent`; Core preserves audio and may pause live insights. ML emits `BackpressureClearedEvent` on recovery.                                                                                                                                                                                                                                                           |
| Speaker state changes while disconnected   | Core persists the state to the database. On reconnect, Core sends `StreamStartEvent` with all current speaker states — ML reconstructs its in-memory map without needing a pull endpoint.                                                                                                                                                                                                  |
| ML pod restarts mid-session                | Core detects the WebSocket drop and reconnects (possibly to a new pod). `StreamStartEvent` includes speaker states and voice profiles. ML fetches recent transcript segments from `GET /transcriptions/{id}/segments?after_ms=...` to rebuild its LLM context window, then resumes processing. Context quality degrades gracefully — the rolling buffer rebuilds over subsequent segments. |
| Drain exceeds budget                       | If the drain exceeds its 30-second budget, ML returns `202 Accepted` to the REST request and writes the remaining results back through Core REST as background processing completes. |

***

## Event Versioning Policy [#event-versioning-policy]

All CloudEvents types use a `.v1` suffix (e.g., `com.wordloop.recording.start.v1`). The versioning policy:

* **Additive changes** (new optional fields, new event types) do not require a version bump. Consumers must ignore unknown fields.
* **Breaking changes** (removed fields, changed semantics, changed required fields) require a new version suffix (`.v2`). The old type continues to be emitted alongside the new type for one release cycle to allow consumer migration.
* **Deprecation**: A deprecated event type is annotated in the contract docs but continues to fire until all known consumers have migrated.

Consumers should be written defensively: parse known fields, ignore unknown fields, and tolerate missing optional fields.
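
A defensive consumer might look like the following Python sketch, which parses an `EntityChangedEvent` by reading only the fields it knows. Treating `version` as optional and returning `None` for other event types are assumptions made for illustration; the `EntityChange` dataclass is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EntityChange:
    entity: str
    action: str
    id: str
    version: Optional[int] = None  # treated as optional here (an assumption)


def parse_entity_changed(event: dict) -> Optional[EntityChange]:
    """Parse known fields, ignore unknown ones, tolerate missing optional ones."""
    if event.get("type") != "com.wordloop.entity.changed.v1":
        return None  # a different type or version: leave it to another handler
    data = event.get("data") or {}
    return EntityChange(
        entity=data["entity"],
        action=data["action"],
        id=data["id"],
        version=data.get("version"),
    )
```

Because the parser never iterates over the payload or rejects extra keys, an additive change to the event does not break it.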

***

## Observability Conventions [#observability-conventions]

Every service must include the following fields in structured log output for any operation related to a live recording session:

| Field              | When present            | Source                                     |
| ------------------ | ----------------------- | ------------------------------------------ |
| `meeting_id`       | Always                  | From the request or event                  |
| `ml_session_id`    | During active recording | From `RecordingStartedEvent` or ML session |
| `sequence`         | Audio chunk operations  | From the chunk metadata                    |
| `transcription_id` | Transcript operations   | From the transcription resource            |
| `traceparent`      | Always                  | From the incoming request/event            |

These fields enable correlation of a single audio chunk or transcript segment across App → Core → ML → AssemblyAI → ML → Core → App, plus GCS writes and Pub/Sub messages.
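
As a small illustration of the convention, the helper below emits one JSON log line with the always-present fields and whichever conditional fields apply. The function name and the `message` key are hypothetical; each service would normally wire these fields into its own logging library.

```python
import json
from typing import Optional


def recording_log_line(message: str, meeting_id: str, traceparent: str,
                       ml_session_id: Optional[str] = None,
                       sequence: Optional[int] = None,
                       transcription_id: Optional[str] = None) -> str:
    """Render one structured log line with the correlation fields."""
    record = {
        "message": message,
        "meeting_id": meeting_id,    # always present
        "traceparent": traceparent,  # always present
    }
    conditional = {
        "ml_session_id": ml_session_id,
        "sequence": sequence,
        "transcription_id": transcription_id,
    }
    # Include conditional fields only when they apply; sequence 0 is kept.
    record.update({k: v for k, v in conditional.items() if v is not None})
    return json.dumps(record, sort_keys=True)
```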

***

## Recording Event History [#recording-event-history]

Core persists a `recording_event_history` table that logs every recording state transition and significant event:

| Column        | Type        | Description                                                                               |
| ------------- | ----------- | ----------------------------------------------------------------------------------------- |
| `id`          | UUID        | Event ID                                                                                  |
| `meeting_id`  | UUID        | Meeting reference                                                                         |
| `event_type`  | text        | e.g., `started`, `stopped`, `error`, `gap_upload`, `compose_started`, `compose_completed` |
| `from_status` | text        | Previous recording status (nullable for initial events)                                   |
| `to_status`   | text        | New recording status                                                                      |
| `metadata`    | jsonb       | Event-specific data (error codes, sequence numbers, chunk counts)                         |
| `created_at`  | timestamptz | When the event occurred                                                                   |

This table is write-only during normal operation. It is the primary diagnostic tool for investigating recording issues in production.


# Meeting (/docs/work/meeting-recording/tdd/contracts/meeting)


# Meeting [#meeting]

A meeting is the top-level entity. It represents a conversation — whether captured live, uploaded as a file, or created as ad-hoc notes. A meeting owns its transcription, synthesis, tasks, and audio. Recording is available via `?expand=recording` for meetings that have one. For shared concerns that apply across all entities — authentication, error format, idempotency, echo suppression — see [Infrastructure](infrastructure).

For the full recording lifecycle — commands, events, binary audio, ML session, and gap repair — see [Recording](recording).

***

## Resource Shape [#resource-shape]

```json
{
  "id": "meeting-uuid",
  "title": "Weekly Product Review",
  "headline": "Rollout plan review",
  "source_type": "live",
  "start_time": "2026-05-01T09:00:00Z",
  "end_time": "2026-05-01T10:00:00Z",
  "created_at": "2026-05-01T08:59:00Z",
  "attendees": [],
  "notes": "## Action Items\n- Follow up with design team\n- Review rollout plan",
  "transcription": {
    "id": "transcription-uuid",
    "status": "completed"
  },
  "synthesis": {
    "summary": "The team aligned on rollout sequencing.",
    "topics": [],
    "talking_points": []
  }
}
```

`headline` is auto-generated by ML from the meeting content and is present on all meetings regardless of source type. It is included in the compact list shape so that it can be displayed when listing meetings. ML writes it via `PATCH /meetings/{id}` with service auth.

**Expand parameter:** `GET /meetings/{id}` supports `?expand=transcription,synthesis,tasks,attendees,recording` to control which nested resources are included. Without expansion, only the top-level fields and summary references (e.g., `transcription.id`, `transcription.status`) are returned. `recording` is expand-only — it is never present in the default or compact shapes. List endpoints (`GET /meetings`) always return the compact form.
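
Parsing the `expand` value could be sketched as follows. The allowed names come from the contract; rejecting unknown names with an error is an assumption, since the contract does not say whether unknown expansions are rejected or ignored.

```python
from typing import Optional, Set

ALLOWED_EXPANSIONS = {"transcription", "synthesis", "tasks", "attendees", "recording"}


def parse_expand(raw: Optional[str]) -> Set[str]:
    """Turn '?expand=a,b' into a validated set of expansion names."""
    if not raw:
        return set()  # no expansion: top-level fields and summary references only
    requested = {part.strip() for part in raw.split(",") if part.strip()}
    unknown = requested - ALLOWED_EXPANSIONS
    if unknown:
        # Assumed behaviour: surface the mistake instead of silently ignoring it.
        raise ValueError(f"unknown expand values: {sorted(unknown)}")
    return requested
```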

**Expanded: recording** — when `?expand=recording` is included and the meeting has a recording:

```json
{
  "recording": {
    "status": "completed",
    "started_at": "2026-05-01T09:00:00Z",
    "stopped_at": "2026-05-01T10:00:00Z",
    "stop_reason": "user_requested",
    "last_received_sequence": 36000,
    "audio_available": true
  }
}
```

If the meeting has no recording, the `recording` field is `null` even when expanded.

Valid `source_type` values: `live`, `upload`, `text`, `anecdotal`.

***

## REST API [#rest-api]

### `POST /meetings` [#post-meetings]

|                      |                                                                          |
| -------------------- | ------------------------------------------------------------------------ |
| **Auth**             | `bearerAuth`                                                             |
| **Idempotency**      | Required                                                                 |
| **Echo suppression** | `Client-Session-Id` optional                                             |
| **Response**         | `201 Created` with `Meeting` + `Location: /meetings/{id}`                |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "meeting", action: "created" }` |

```json
{
  "title": "Weekly Product Review",
  "source_type": "live",
  "start_time": "2026-05-01T09:00:00Z"
}
```

### `GET /meetings` [#get-meetings]

|                  |                                                                  |
| ---------------- | ---------------------------------------------------------------- |
| **Auth**         | `bearerAuth`                                                     |
| **Response**     | `200 MeetingList`                                                |
| **Query params** | `cursor`, `limit`, `has_active_recording` **New**, `source_type` |

The compact list shape includes: `id`, `title`, `headline`, `source_type`, `start_time`, `end_time`, `created_at`, `attendees` (compact: id + display\_name only), and `transcription` (compact: id + status only). Recording, synthesis, and tasks are never included in the list shape; use the detail endpoint with `?expand` to fetch them. **New:** `has_active_recording=true` filters to meetings with an active live recording. The app uses this as the read-only guard for disabling **Start Live Recording**.

### `GET /meetings/{id}` [#get-meetingsid]

|                  |                                                                                             |
| ---------------- | ------------------------------------------------------------------------------------------- |
| **Auth**         | `bearerAuth`                                                                                |
| **Response**     | `200 Meeting`                                                                               |
| **Query params** | `expand` (comma-separated: `transcription`, `synthesis`, `tasks`, `attendees`, `recording`) |
| **Errors**       | `404` meeting not found                                                                     |

### `PATCH /meetings/{id}` [#patch-meetingsid]

|                      |                                                                          |
| -------------------- | ------------------------------------------------------------------------ |
| **Auth**             | `bearerAuth` or service auth                                             |
| **Echo suppression** | `Client-Session-Id` optional                                             |
| **Response**         | `200 Meeting`                                                            |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "meeting", action: "updated" }` |

User update:

```json
{
  "notes": "## Action Items\n- Follow up with design team"
}
```

ML write-back (service auth):

```json
{
  "headline": "Rollout plan review"
}
```

### `DELETE /meetings/{id}` [#delete-meetingsid]

|                  |                                                                          |
| ---------------- | ------------------------------------------------------------------------ |
| **Auth**         | `bearerAuth`                                                             |
| **Response**     | `204 No Content`                                                         |
| **Errors**       | `409` active recording in progress                                       |
| **Side effects** | Broadcasts `EntityChangedEvent { entity: "meeting", action: "deleted" }` |

### `POST /meetings/{id}/speaker-labels` — **New** [#post-meetingsidspeaker-labels--new]

|                      |                                                                                                                                                                                 |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Auth**             | `bearerAuth`                                                                                                                                                                    |
| **Idempotency**      | Required                                                                                                                                                                        |
| **Echo suppression** | `Client-Session-Id` optional                                                                                                                                                    |
| **Response**         | `200 SpeakerLabelAssignment`                                                                                                                                                    |
| **Errors**           | `404` meeting not found; `422` person not found; `422` speaker label not present in the meeting                                                                                 |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "transcript_segment", action: "updated" }`. If a live session is active, sends `SpeakerStateUpdatedEvent` to ML over the ML WebSocket. |

Request:

```json
{
  "speaker_label": "speaker_1",
  "person_id": "person-uuid"
}
```

Response:

```json
{
  "meeting_id": "meeting-uuid",
  "speaker_label": "speaker_1",
  "person_id": "person-uuid",
  "state": "manual",
  "updated_segment_count": 27
}
```

For how speaker labels feed the identification pipeline, see [Person & Speaker Identity](person).

### `GET /meetings/{id}/audio-url` — **New** [#get-meetingsidaudio-url--new]

|                   |                                                                     |
| ----------------- | ------------------------------------------------------------------- |
| **Auth**          | `bearerAuth`                                                        |
| **Response**      | `200 AudioPlaybackUrl`                                              |
| **Cache-Control** | `private, no-store`                                                 |
| **Errors**        | `404` meeting not found; `404` audio still composing or unavailable |

```json
{
  "url": "https://storage.googleapis.com/signed-url",
  "expires_at": "2026-05-01T10:00:00Z",
  "mime_type": "audio/webm",
  "duration_ms": 3610000
}
```
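
Because the signed URL expires and the response is not cacheable, a client may want to check `expires_at` before reusing a URL it already holds. The sketch below is one possible client-side check; the 30-second safety margin and the function names are assumptions.

```python
from datetime import datetime, timedelta, timezone


def parse_rfc3339(value: str) -> datetime:
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def needs_refresh(playback: dict, now: datetime,
                  safety_margin: timedelta = timedelta(seconds=30)) -> bool:
    """True when the signed URL has expired or is about to expire."""
    return now + safety_margin >= parse_rfc3339(playback["expires_at"])
```

When `needs_refresh` returns true, the client calls `GET /meetings/{id}/audio-url` again to obtain a fresh URL.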


# Person & Speaker Identity (/docs/work/meeting-recording/tdd/contracts/person)


# Person & Speaker Identity [#person--speaker-identity]

People are speaker identities. They can be referenced by tasks (assignee) and transcript segments (speaker attribution). This page covers person CRUD, the speaker identification pipeline that resolves anonymous diarisation labels to known people, and voice profile management. For shared semantics, see [Infrastructure](infrastructure).

**User-scoped identity:** People are scoped to the authenticated user. Each user maintains their own set of people — there is no cross-user sharing of person records or voice profiles. If User A records a meeting with Person X, and User B later records with the same real-world person, User B must create their own Person record. Voice profile enrichment applies only within the owning user's data. This is a deliberate simplification for v1; organisation-level identity sharing is out of scope.

## Resource Shape [#resource-shape]

```json
{
  "id": "person-uuid",
  "display_name": "Avery Chen",
  "full_name": "Avery Chen",
  "title": "Product Manager",
  "role": "Product",
  "company": "WordLoop",
  "email": "avery@example.com",
  "voice_confidence": 0.91,
  "voice_model_status": "ready",
  "tags": ["team-alpha"],
  "created_at": "2026-04-15T10:00:00Z",
  "updated_at": "2026-05-01T09:15:00Z"
}
```

Valid `voice_model_status` values: `untrained`, `training`, `ready`, `failed`.

## REST API [#rest-api]

### `GET /people` [#get-people]

Lists people for the authenticated user. Used for the speaker-labelling autocomplete.

|                  |                                               |
| ---------------- | --------------------------------------------- |
| **Auth**         | `bearerAuth`                                  |
| **Response**     | `200 PersonList`                              |
| **Query params** | `cursor`, `limit`, `q` (search by name/email) |

### `POST /people` [#post-people]

Creates a person. Used during speaker labelling when the user adds a new person.

|                  |                                                                         |
| ---------------- | ----------------------------------------------------------------------- |
| **Auth**         | `bearerAuth`                                                            |
| **Idempotency**  | Required                                                                |
| **Response**     | `201 Created` with `Person` + `Location: /people/{id}`                  |
| **Side effects** | Broadcasts `EntityChangedEvent { entity: "person", action: "created" }` |

```json
{
  "display_name": "Avery Chen",
  "full_name": "Avery Chen",
  "email": "avery@example.com"
}
```

### `GET /people/{id}` [#get-peopleid]

Returns a single person.

|              |                        |
| ------------ | ---------------------- |
| **Auth**     | `bearerAuth`           |
| **Response** | `200 Person`           |
| **Errors**   | `404` person not found |

### `PATCH /people/{id}` [#patch-peopleid]

Updates person metadata.

|                  |                                                                         |
| ---------------- | ----------------------------------------------------------------------- |
| **Auth**         | `bearerAuth`                                                            |
| **Response**     | `200 Person`                                                            |
| **Side effects** | Broadcasts `EntityChangedEvent { entity: "person", action: "updated" }` |

### `DELETE /people/{id}` [#delete-peopleid]

Deletes a person. Transcript segments retain the `speaker_label` but clear the `person_id`.

|                  |                                                                         |
| ---------------- | ----------------------------------------------------------------------- |
| **Auth**         | `bearerAuth`                                                            |
| **Response**     | `204 No Content`                                                        |
| **Side effects** | Broadcasts `EntityChangedEvent { entity: "person", action: "deleted" }` |

***

## Speaker Identification Pipeline [#speaker-identification-pipeline]

During a live recording, AssemblyAI produces diarised transcript segments with anonymous labels (`speaker_1`, `speaker_2`). ML resolves these to known people through voice embedding comparison. The pipeline has four states:

| State       | Behaviour                                                                                                                                                                                                                          |
| ----------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `unmatched` | Compare this segment's embedding against in-session voice profiles (pushed by Core). If confidence exceeds the threshold → transition to `matched`. Otherwise, increment attempts and retry on the next segment from this speaker. |
| `matched`   | The speaker label is locked to a person. All future segments from this speaker are tagged immediately — no further voice comparison needed.                                                                                        |
| `exhausted` | After N failed attempts (configurable, e.g. 5 segments), stop comparing for this speaker. The raw `speaker_label` is preserved. The user can manually resolve it.                                                                  |
| `manual`    | Set when the user labels a speaker via `POST /meetings/{id}/speaker-labels` (see [Meeting](meeting)). Takes precedence over voice matching — ML will not attempt to match this speaker regardless of voice similarity.             |

Manual speaker labelling is documented on the [Meeting](meeting) page. The REST fallback for pushing speaker state to ML during session recovery is documented on the [Recording](recording) page (`POST /meetings/{id}/live-session/speaker-states`).
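
The four states in the table above can be read as a small state machine. The sketch below is illustrative only: `MAX_ATTEMPTS`, `THRESHOLD`, `SpeakerState`, `on_segment`, and `on_manual_label` are hypothetical names, and the numeric values are borrowed from the example payloads on this page rather than from the real ML configuration.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 5   # assumed value for "N"; the source says it is configurable
THRESHOLD = 0.88   # assumed; copied from the example payloads on this page


@dataclass
class SpeakerState:
    speaker_label: str
    state: str = "unmatched"
    person_id: Optional[str] = None
    attempt_count: int = 0


def on_segment(s: SpeakerState, best_person_id: Optional[str], score: float) -> SpeakerState:
    """Advance one speaker's state after a segment has been compared."""
    # matched, exhausted and manual all stop further voice comparison.
    if s.state != "unmatched":
        return s
    if best_person_id is not None and score > THRESHOLD:
        s.state, s.person_id = "matched", best_person_id
        return s
    s.attempt_count += 1
    if s.attempt_count >= MAX_ATTEMPTS:
        s.state = "exhausted"
    return s


def on_manual_label(s: SpeakerState, person_id: str) -> SpeakerState:
    """A user label overrides any automatic state."""
    s.state, s.person_id = "manual", person_id
    return s
```

Because `manual` is handled like any other non-`unmatched` state in `on_segment`, a later high-scoring segment cannot overwrite a label the user set.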

***

## ML Integration [#ml-integration]

### Core → ML [#core--ml]

#### WebSocket: `SpeakerStateUpdatedEvent` [#websocket-speakerstateupdatedevent]

Keeps ML aligned with the user's speaker-label changes during a live session. The `manual` state takes precedence over voice matching.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ml-ws",
  "type": "com.wordloop.ml.speaker_state.updated.v1",
  "time": "2026-05-01T09:15:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "speaker_label": "speaker_1",
    "state": "manual",
    "person_id": "person-uuid"
  }
}
```

#### WebSocket: `VoiceProfilesUpdatedEvent` [#websocket-voiceprofilesupdatedevent]

Refreshes the in-session voice profile cache when Core enrolls or updates a profile while a recording is active.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ml-ws",
  "type": "com.wordloop.ml.voice_profiles.updated.v1",
  "time": "2026-05-01T09:16:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "profiles": [
      {
        "person_id": "person-uuid",
        "embedding_model": "ecapa-tdnn-v1",
        "embedding": [0.12, -0.34]
      }
    ]
  }
}
```

### ML → Core [#ml--core]

#### WebSocket: `SpeakerMatchProducedEvent` [#websocket-speakermatchproducedevent]

Reports a confident speaker-to-person match. Core updates all matching segments and persists `meeting_speaker_states` as `matched`.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.speaker_match.v1",
  "time": "2026-05-01T09:04:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "speaker_label": "speaker_1",
    "person_id": "person-uuid",
    "score": 0.93,
    "threshold": 0.88,
    "state": "matched"
  }
}
```

#### WebSocket: `SpeakerExhaustedEvent` [#websocket-speakerexhaustedevent]

Tells Core that ML has stopped trying to match an unknown speaker after the bounded attempt count.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.speaker_exhausted.v1",
  "time": "2026-05-01T09:08:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "speaker_label": "speaker_2",
    "attempt_count": 5,
    "state": "exhausted"
  }
}
```

***

## Voice Profile Operations [#voice-profile-operations]

Voice profiles power speaker identification. Core stores person records; ML owns embedding extraction and matching semantics.

### `POST /voice-profiles/matches` [#post-voice-profilesmatches]

Compares a speaker embedding against enrolled voice profiles. Core supplies candidate profiles explicitly.

|              |                                                            |
| ------------ | ---------------------------------------------------------- |
| **Auth**     | service auth                                               |
| **Response** | `200 VoiceMatchResponse`                                   |
| **Errors**   | `422` invalid embedding; `503` embedding model unavailable |

**Request:**

```json
{
  "meeting_id": "meeting-uuid",
  "speaker_label": "speaker_1",
  "embedding_model": "ecapa-tdnn-v1",
  "embedding": [0.12, -0.34],
  "candidate_person_ids": ["person-uuid"],
  "top_k": 3
}
```

**Response:**

```json
{
  "matches": [
    {
      "person_id": "person-uuid",
      "score": 0.93,
      "threshold": 0.88,
      "decision": "matched"
    }
  ]
}
```
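
To make the `score`, `threshold`, and `top_k` fields concrete, here is a minimal scoring sketch. It assumes cosine similarity as the comparison metric and `"unmatched"` as the non-matching `decision` value; neither is confirmed by this contract, and `cosine` and `match` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match(
    embedding: Sequence[float],
    profiles: Dict[str, Sequence[float]],  # person_id -> enrolled embedding
    threshold: float = 0.88,               # assumed; taken from the example response
    top_k: int = 3,
) -> List[dict]:
    """Score every candidate profile and return the best top_k, highest first."""
    scored = []
    for person_id, profile in profiles.items():
        score = round(cosine(embedding, profile), 4)
        scored.append({
            "person_id": person_id,
            "score": score,
            "threshold": threshold,
            # "unmatched" is an assumed label for scores at or below threshold.
            "decision": "matched" if score > threshold else "unmatched",
        })
    scored.sort(key=lambda m: m["score"], reverse=True)
    return scored[:top_k]
```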

### `POST /voice-profiles` [#post-voice-profiles]

Creates or enriches a person's voice profile from post-meeting segment embeddings.

|                          |                                                                                         |
| ------------------------ | --------------------------------------------------------------------------------------- |
| **Auth**                 | service auth                                                                            |
| **Idempotency**          | Required                                                                                |
| **Request Content-Type** | `multipart/form-data` for audio samples or `application/json` for segment references    |
| **Response**             | `201 Created` or `200 OK` with `VoiceProfile` + `Location: /voice-profiles/{person_id}` |

**Request:**

```json
{
  "person_id": "person-uuid",
  "meeting_id": "meeting-uuid",
  "segment_ids": ["segment-uuid"],
  "embedding_model": "ecapa-tdnn-v1"
}
```

**Response:**

```json
{
  "person_id": "person-uuid",
  "embedding_model": "ecapa-tdnn-v1",
  "sample_count": 12,
  "quality_score": 0.91,
  "updated_at": "2026-05-01T10:15:00Z"
}
```


# Recording (/docs/work/meeting-recording/tdd/contracts/recording)


# Recording [#recording]

Recording state is a sub-resource of Meeting. This page covers the full lifecycle: starting, stopping, resuming, ML session orchestration, and gap repair. For binary audio frame formats and transport, see [Audio](audio). For shared connection semantics, see [Infrastructure](infrastructure).

**Recording creation:** The recording resource is created as a side effect of `StartRecordingCommand` (see WebSocket commands below). There is no `POST /meetings/{id}/recording` endpoint — the recording lifecycle is entirely driven by WebSocket commands. The REST surface provides read-only access to recording state and chunk management for gap recovery.

## Resource Shape [#resource-shape]

```json
{
  "meeting_id": "meeting-uuid",
  "status": "active",
  "started_at": "2026-05-01T09:00:00Z",
  "stopped_at": null,
  "stop_reason": null,
  "last_received_sequence": 1842,
  "missing_sequences": [1801, 1802],
  "audio_object_prefix": "meetings/meeting-uuid/chunks/",
  "degraded_reasons": ["ml_unavailable"],
  "max_duration_seconds": 14400,
  "ml_session_id": "ml-session-uuid"
}
```

Valid statuses: `active`, `stopping`, `composing`, `completed`, `failed`.

## REST API [#rest-api]

### `GET /meetings/{id}/recording` [#get-meetingsidrecording]

Returns the recording state for a meeting, including audio-chunk continuity and degradation state. Returns `404` if the meeting has never been recorded.

|                   |                                                                          |
| ----------------- | ------------------------------------------------------------------------ |
| **Auth**          | `bearerAuth`                                                             |
| **Response**      | `200 MeetingRecording`                                                   |
| **Cache-Control** | `private, no-store`                                                      |
| **Errors**        | `404` meeting not found or never recorded; `403` belongs to another user |

### `GET /meetings/{id}/recording/missing-chunks` [#get-meetingsidrecordingmissing-chunks]

Returns the chunk sequences Core has not durably stored in GCS. The app calls this after reconnect or stop to determine which OPFS chunks to upload.

|                   |                        |
| ----------------- | ---------------------- |
| **Auth**          | `bearerAuth`           |
| **Response**      | `200 MissingChunkList` |
| **Cache-Control** | `private, no-store`    |

```json
{
  "meeting_id": "meeting-uuid",
  "missing_sequences": [1801, 1802],
  "accepted_mime_types": ["audio/webm"],
  "max_chunk_bytes": 1048576
}
```

### `GET /meetings/{id}/recording/chunk-inventory` — **New** (Diagnostic) [#get-meetingsidrecordingchunk-inventory--new-diagnostic]

Returns the full chunk inventory for a recording. Admin/diagnostic use only — not called by the app during normal operation.

|                   |                      |
| ----------------- | -------------------- |
| **Auth**          | service auth         |
| **Response**      | `200 ChunkInventory` |
| **Cache-Control** | `private, no-store`  |

```json
{
  "meeting_id": "meeting-uuid",
  "total_chunks_stored": 35998,
  "highest_contiguous_sequence": 35998,
  "gaps": [35999, 36000],
  "total_bytes": 172800000,
  "composition_status": "pending",
  "first_chunk_at": "2026-05-01T09:00:00Z",
  "last_chunk_at": "2026-05-01T10:00:00Z"
}
```

### `POST /meetings/{id}/recording/chunks` [#post-meetingsidrecordingchunks]

Uploads OPFS gap chunks. Core verifies `sha256`, de-duplicates by sequence, stores each chunk at `meetings/{meeting_id}/chunks/{sequence}.webm`, and returns the remaining gap set. Only `multipart/form-data` is accepted — no base64-encoded JSON.

|                          |                                                                              |
| ------------------------ | ---------------------------------------------------------------------------- |
| **Auth**                 | `bearerAuth`                                                                 |
| **Idempotency**          | Required                                                                     |
| **Request Content-Type** | `multipart/form-data`                                                        |
| **Response**             | `200 GapUploadResult` + `Location: /meetings/{id}/recording`                 |
| **Errors**               | `409` audio already composed; `413` chunk too large; `422` checksum mismatch |

Each part includes:

| Field           | Type    | Description                            |
| --------------- | ------- | -------------------------------------- |
| `sequence`      | integer | Monotonic chunk sequence number        |
| `started_at_ms` | integer | Chunk start offset in milliseconds     |
| `duration_ms`   | integer | Chunk duration in milliseconds         |
| `mime_type`     | string  | `audio/webm`                           |
| `sha256`        | string  | Hex-encoded SHA-256 of the audio bytes |
| `audio`         | binary  | Raw audio chunk bytes                  |

```json
{
  "meeting_id": "meeting-uuid",
  "accepted_sequences": [1801],
  "remaining_missing_sequences": [1802],
  "last_contiguous_sequence": 1801
}
```
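
The verification and de-duplication steps described for this endpoint might look roughly like the sketch below. It is not Core's implementation (Core is written in Go). `accept_gap_chunks` is a hypothetical helper, sequences are assumed to start at 1, and "last contiguous" is interpreted as the highest `n` for which every sequence `1..n` is stored.

```python
import hashlib
from typing import Iterable, List, Set


def accept_gap_chunks(stored: Set[int], missing: Iterable[int], parts: List[dict]) -> dict:
    """Verify, de-duplicate, and record uploaded parts; report what is still missing."""
    accepted = []
    remaining = set(missing)
    for part in parts:
        seq = part["sequence"]
        digest = hashlib.sha256(part["audio"]).hexdigest()
        if digest != part["sha256"]:
            # Would surface as 422 checksum mismatch in the HTTP layer.
            raise ValueError(f"checksum mismatch for sequence {seq}")
        if seq in stored:
            continue  # de-duplicate by sequence: a repeat upload is a no-op
        stored.add(seq)
        remaining.discard(seq)
        accepted.append(seq)
    # Assumption: sequences start at 1.
    last_contiguous = 0
    while last_contiguous + 1 in stored:
        last_contiguous += 1
    return {
        "accepted_sequences": accepted,
        "remaining_missing_sequences": sorted(remaining),
        "last_contiguous_sequence": last_contiguous,
    }
```

Skipping already-stored sequences rather than rejecting them keeps a retried upload harmless, which fits the required idempotency of this endpoint.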

## Real-Time Events [#real-time-events]

### Browser → Core [#browser--core]

#### `StartRecordingCommand` [#startrecordingcommand]

Starts a live recording for a meeting. If the user already has an active recording, Core returns `RecordingErrorEvent` with `code: "session_conflict"`.

```json
{
  "specversion": "1.0",
  "id": "command-uuid",
  "source": "wordloop-app/ws",
  "type": "com.wordloop.recording.start.v1",
  "time": "2026-05-01T09:00:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "client_recording_id": "browser-generated-uuid",
    "audio_config": {
      "encoding": "webm",
      "sample_rate": 48000,
      "channels": 1,
      "chunk_duration_ms": 100
    },
    "max_duration_seconds": 14400
  }
}
```
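
Every WebSocket message on this page shares the same envelope fields (`specversion`, `id`, `source`, `type`, `time`, `traceparent`, `data`). A small builder keeps them consistent. The sketch below is an illustration: `make_event` is a hypothetical helper, and the `traceparent` value is passed through unchanged from whatever tracing context the caller has.

```python
import json
import uuid
from datetime import datetime, timezone


def make_event(event_type: str, source: str, data: dict, traceparent: str) -> dict:
    """Wrap a payload in the envelope used by the examples on this page."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # UTC timestamp in the same format as the examples, e.g. 2026-05-01T09:00:00Z
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "traceparent": traceparent,
        "data": data,
    }


start = make_event(
    "com.wordloop.recording.start.v1",
    "wordloop-app/ws",
    {
        "meeting_id": "meeting-uuid",
        "client_recording_id": "browser-generated-uuid",
        "audio_config": {
            "encoding": "webm",
            "sample_rate": 48000,
            "channels": 1,
            "chunk_duration_ms": 100,
        },
        "max_duration_seconds": 14400,
    },
    traceparent="00-...",  # placeholder, left elided as in the examples
)
print(json.dumps(start, indent=2))
```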

#### `StopRecordingCommand` [#stoprecordingcommand]

Stops the recording. The app includes the last sequence written to OPFS so Core can report gaps precisely.

```json
{
  "specversion": "1.0",
  "id": "command-uuid",
  "source": "wordloop-app/ws",
  "type": "com.wordloop.recording.stop.v1",
  "time": "2026-05-01T10:00:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "last_client_sequence": 36000,
    "opfs_manifest_sha256": "hex-encoded-sha256"
  }
}
```

#### `ResumeRecordingCommand` — **New** [#resumerecordingcommand--new]

Sent by the app after a WebSocket reconnect during an active recording. Carries the client's last known sequence so Core can report the GCS gap.

```json
{
  "specversion": "1.0",
  "id": "command-uuid",
  "source": "wordloop-app/ws",
  "type": "com.wordloop.recording.resume.v1",
  "time": "2026-05-01T09:30:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "last_client_sequence": 18000
  }
}
```

### Core → Browser [#core--browser]

#### `RecordingStartedEvent` [#recordingstartedevent]

Confirms that Core, ML, storage, and transcription pre-warm are ready. The banner transitions from **Connecting** to **Recording**.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.started.v1",
  "time": "2026-05-01T09:00:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "ml_session_id": "ml-session-uuid",
    "started_at": "2026-05-01T09:00:00Z",
    "max_duration_seconds": 14400
  }
}
```

#### `RecordingResumedEvent` — **New** [#recordingresumedevent--new]

Sent in response to `ResumeRecordingCommand`. Reports the last sequence Core has stored in GCS and any missing sequences, so the app knows which OPFS chunks to upload.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.resumed.v1",
  "time": "2026-05-01T09:30:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "last_stored_sequence": 17500,
    "missing_sequences": [17501, 17502],
    "ml_session_id": "ml-session-uuid"
  }
}
```
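
On the client side, the event can be turned into an upload plan. The sketch below is one possible reading, under the assumption that the app should upload every OPFS chunk that is either listed in `missing_sequences` or newer than `last_stored_sequence`. `chunks_to_upload` is a hypothetical helper, and the real app is a browser client rather than Python.

```python
from typing import Iterable, List


def chunks_to_upload(
    opfs_sequences: Iterable[int],
    last_stored_sequence: int,
    missing_sequences: Iterable[int],
) -> List[int]:
    """Return the locally buffered sequences that Core does not yet hold, in order."""
    missing = set(missing_sequences)
    return sorted(
        seq
        for seq in opfs_sequences
        if seq > last_stored_sequence or seq in missing
    )
```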

#### `GapUploadCompleteEvent` — **New** [#gapuploadcompleteevent--new]

Confirms that all gap chunks have been received and stored. The app can clear OPFS and resume normal operation.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.gap_upload_complete.v1",
  "time": "2026-05-01T09:31:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "last_stored_sequence": 18000
  }
}
```

#### `RecordingStoppedEvent` [#recordingstoppedevent]

Confirms recording has stopped. The client calls `GET /meetings/{id}/recording/missing-chunks` to determine which OPFS gap chunks to upload before audio composition can proceed.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.stopped.v1",
  "time": "2026-05-01T10:00:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "reason": "user_requested",
    "last_received_sequence": 35998,
    "last_client_sequence": 36000,
    "post_processing_started": true
  }
}
```

Valid reasons: `user_requested`, `duration_limit`, `connection_closed`, `server_shutdown`, `storage_failure`.

#### `RecordingDurationWarningEvent` [#recordingdurationwarningevent]

Warns the client before server-side auto-stop.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.duration_warning.v1",
  "time": "2026-05-01T12:50:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "remaining_seconds": 600,
    "auto_stop_at": "2026-05-01T13:00:00Z"
  }
}
```
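
The `auto_stop_at` and `remaining_seconds` fields follow from `started_at` and `max_duration_seconds`. A minimal sketch of that arithmetic, with hypothetical function names:

```python
from datetime import datetime, timedelta, timezone


def auto_stop_at(started_at: datetime, max_duration_seconds: int) -> datetime:
    """Moment at which the server would stop the recording."""
    return started_at + timedelta(seconds=max_duration_seconds)


def remaining_seconds(now: datetime, started_at: datetime, max_duration_seconds: int) -> int:
    """Whole seconds left before auto-stop, never negative."""
    delta = auto_stop_at(started_at, max_duration_seconds) - now
    return max(0, int(delta.total_seconds()))


started = datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc)
now = datetime(2026, 5, 1, 12, 50, tzinfo=timezone.utc)
print(auto_stop_at(started, 14400).isoformat())  # 2026-05-01T13:00:00+00:00
print(remaining_seconds(now, started, 14400))    # 600
```

With the values used throughout this page (start at 09:00, limit of 14400 seconds), the result matches the example payload above.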

#### `RecordingErrorEvent` [#recordingerrorevent]

Reports degraded or failed recording conditions. Recoverable conditions include a paired recovery code (e.g., `ml_unavailable` → `ml_recovered`).

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.error.v1",
  "time": "2026-05-01T09:10:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "code": "ml_unavailable",
    "severity": "degraded",
    "message": "Live insights paused. Audio is still recording.",
    "retry_after_ms": 1000
  }
}
```

Valid codes: `ml_unavailable`, `ml_recovered`, `storage_unavailable`, `storage_recovered`, `insight_warning`, `transcoder_error`, `no_audio_detected`, `session_conflict`, `audio_checksum_mismatch`, `backpressure`.
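
A client could fold these codes into a `degraded_reasons` list like the one in the recording resource. The sketch below assumes that `storage_unavailable` pairs with `storage_recovered` in the same way as the documented `ml_unavailable` and `ml_recovered` pair, and that other codes do not change the degraded set. `apply_error_code` is a hypothetical helper.

```python
from typing import List

# recovery code -> the degraded condition it clears
RECOVERY_CODES = {
    "ml_recovered": "ml_unavailable",
    "storage_recovered": "storage_unavailable",  # pairing assumed from the names
}


def apply_error_code(degraded: List[str], code: str) -> List[str]:
    """Return the new degraded_reasons list after one RecordingErrorEvent."""
    if code in RECOVERY_CODES:
        cleared = RECOVERY_CODES[code]
        return [reason for reason in degraded if reason != cleared]
    if code in RECOVERY_CODES.values() and code not in degraded:
        return degraded + [code]
    # Other codes (for example backpressure) leave the degraded set unchanged here.
    return degraded
```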

#### `AudioChunkStoredEvent` — **New** [#audiochunkstoredevent--new]

Periodically reports the highest contiguous sequence number durably stored in GCS. The client uses this to trim the OPFS shadow buffer during normal operation — without it, OPFS grows unboundedly during long sessions. Core emits this event every 10 seconds (or every 100 chunks, whichever comes first) during an active recording.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.audio_chunk_stored.v1",
  "time": "2026-05-01T09:05:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "highest_contiguous_sequence": 3000,
    "total_chunks_stored": 3000
  }
}
```
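
The "every 10 seconds or every 100 chunks, whichever comes first" cadence can be expressed as a small counter. This is a sketch under stated assumptions, not Core's code: `StoredProgressEmitter` is a hypothetical class, and time is passed in explicitly so the logic is easy to test.

```python
class StoredProgressEmitter:
    """Decides when an AudioChunkStoredEvent is due."""

    def __init__(self, interval_seconds: float = 10.0, interval_chunks: int = 100):
        self.interval_seconds = interval_seconds
        self.interval_chunks = interval_chunks
        self._last_emit_time = None
        self._chunks_since_emit = 0

    def on_chunk_stored(self, now: float) -> bool:
        """Call once per durably stored chunk; returns True when an event should be sent."""
        if self._last_emit_time is None:
            self._last_emit_time = now  # the first chunk starts the timer
        self._chunks_since_emit += 1
        due = (
            self._chunks_since_emit >= self.interval_chunks
            or now - self._last_emit_time >= self.interval_seconds
        )
        if due:
            self._last_emit_time = now
            self._chunks_since_emit = 0
        return due
```

At the 100 ms chunk duration used in the examples, both limits coincide at roughly ten seconds. The chunk limit matters for shorter chunks, and the time limit matters when chunks arrive slowly.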

## ML Integration [#ml-integration]

### Core → ML [#core--ml]

#### REST: `POST /meetings/{id}/live-session` [#rest-post-meetingsidlive-session]

Creates and pre-warms the ML side of a live recording. ML opens the upstream AssemblyAI session, loads speaker states (pushed by Core in the request body), prepares the insight pipeline, and returns the WebSocket URL.

|                 |                                                                                           |
| --------------- | ----------------------------------------------------------------------------------------- |
| **Auth**        | service auth                                                                              |
| **Idempotency** | Required; key maps to `(meeting_id, transcription_id)`                                    |
| **Response**    | `201 Created` with `MLLiveSession` + `Location: /meetings/{id}/live-session`              |
| **Errors**      | `409` active session already exists for meeting; `503` transcription provider unavailable |

**Request:**

```json
{
  "meeting_id": "meeting-uuid",
  "transcription_id": "transcription-uuid",
  "user_id": "user-uuid",
  "audio_config": {
    "encoding": "webm",
    "sample_rate": 48000,
    "channels": 1,
    "chunk_duration_ms": 100
  },
  "speaker_states": [
    {
      "speaker_label": "speaker_1",
      "state": "manual",
      "person_id": "person-uuid",
      "attempt_count": 0
    }
  ],
  "voice_profiles": [
    {
      "person_id": "person-uuid",
      "embedding_model": "ecapa-tdnn-v1",
      "embedding": [0.12, -0.34]
    }
  ],
  "insight_policy": {
    "talking_point_cadence_seconds": 30,
    "talking_point_cadence_segments": 5,
    "task_extraction": "live"
  }
}
```

**Response:**

```json
{
  "id": "ml-session-uuid",
  "meeting_id": "meeting-uuid",
  "status": "ready",
  "websocket_url": "wss://ml.internal/meetings/meeting-uuid/live-session/stream",
  "expires_at": "2026-05-01T13:00:00Z",
  "max_duration_seconds": 14400
}
```

#### REST: `GET /meetings/{id}/live-session` [#rest-get-meetingsidlive-session]

Returns ML's authoritative view of a live session. Core uses this for diagnostics and recovery decisions.

|              |                                     |
| ------------ | ----------------------------------- |
| **Auth**     | service auth                        |
| **Response** | `200 MLLiveSessionStatus`           |
| **Errors**   | `404` no active session for meeting |

```json
{
  "id": "ml-session-uuid",
  "meeting_id": "meeting-uuid",
  "status": "streaming",
  "last_audio_sequence_received": 1842,
  "last_audio_sequence_processed": 1841,
  "last_output_event_id": "event-uuid",
  "speaker_state_count": 3,
  "context_buffer_segment_count": 45,
  "degraded_reasons": []
}
```

Valid statuses: `created`, `ready`, `streaming`, `draining`, `completed`, `failed`, `expired`.

#### REST: `POST /meetings/{id}/live-session/drain` [#rest-post-meetingsidlive-sessiondrain]

Requests a graceful drain of the live session. Core normally sends `DrainCommand` over the ML WebSocket first; this REST endpoint is the idempotent fallback for control-plane cleanup. Uses POST (not DELETE) to carry the request body and express intent clearly.

|                 |                                                                    |
| --------------- | ------------------------------------------------------------------ |
| **Auth**        | service auth                                                       |
| **Idempotency** | Required                                                           |
| **Response**    | `202 Accepted` while draining; `204 No Content` if already drained |
| **Errors**      | `404` no active session for meeting                                |

```json
{
  "reason": "user_requested",
  "last_received_sequence": 35998,
  "audio_composed_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm"
}
```

#### REST: `POST /meetings/{id}/live-session/speaker-states` [#rest-post-meetingsidlive-sessionspeaker-states]

Pushes a speaker-state update to ML outside the live WebSocket path. The normal path is `SpeakerStateUpdatedEvent` over WebSocket; REST is the fallback when Core recovers a session and needs to reconcile state. For the full speaker identification pipeline, see [Person & Speaker Identity](person).

|                 |                                                          |
| --------------- | -------------------------------------------------------- |
| **Auth**        | service auth                                             |
| **Idempotency** | Required                                                 |
| **Response**    | `204 No Content`                                         |
| **Errors**      | `404` no active session; `409` session already completed |

```json
{
  "speaker_label": "speaker_1",
  "state": "manual",
  "person_id": "person-uuid",
  "updated_at": "2026-05-01T09:15:00Z"
}
```

#### WebSocket: `StreamStartEvent` [#websocket-streamstartevent]

Sent immediately after WebSocket open. Confirms the session and gives ML the replay point for reconnects. **Includes current speaker states and voice profiles** so ML can reconstruct its in-memory state on every connection — initial or reconnect.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ml-ws",
  "type": "com.wordloop.ml.stream.start.v1",
  "time": "2026-05-01T09:00:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "ml_session_id": "ml-session-uuid",
    "transcription_id": "transcription-uuid",
    "last_audio_sequence": 0,
    "last_ml_event_id": null,
    "speaker_states": [
      {
        "speaker_label": "speaker_1",
        "state": "manual",
        "person_id": "person-uuid",
        "attempt_count": 0
      }
    ],
    "voice_profiles": [
      {
        "person_id": "person-uuid",
        "embedding_model": "ecapa-tdnn-v1",
        "embedding": [0.12, -0.34]
      }
    ]
  }
}
```
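
Because the event carries the full speaker and profile state, the receiving side can rebuild its in-memory view from this one message. A minimal sketch, assuming simple dictionaries keyed by `speaker_label` and `person_id`; `rebuild_session_state` is a hypothetical helper:

```python
from typing import Dict, List, Tuple


def rebuild_session_state(event: dict) -> Tuple[Dict[str, dict], Dict[str, List[float]], int]:
    """Rebuild lookup tables and the replay point from a StreamStartEvent."""
    data = event["data"]
    # speaker_label -> full state record (state, person_id, attempt_count)
    speaker_states = {s["speaker_label"]: s for s in data["speaker_states"]}
    # person_id -> embedding vector used for in-session matching
    voice_profiles = {p["person_id"]: p["embedding"] for p in data["voice_profiles"]}
    return speaker_states, voice_profiles, data["last_audio_sequence"]
```

Replacing the tables wholesale on every connection, rather than merging, means a reconnect cannot leave stale labels behind.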

#### WebSocket: `DrainCommand` [#websocket-draincommand]

Requests a graceful drain. ML closes upstream transcription, flushes final segments, emits `StreamDrainedEvent`, then closes the WebSocket.

```json
{
  "specversion": "1.0",
  "id": "command-uuid",
  "source": "wordloop-core/ml-ws",
  "type": "com.wordloop.ml.recording.drain.v1",
  "time": "2026-05-01T10:00:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "reason": "user_requested",
    "last_audio_sequence": 35998,
    "audio_composed_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm"
  }
}
```

### ML → Core [#ml--core]

#### WebSocket: `StreamReadyEvent` [#websocket-streamreadyevent]

Confirms ML has opened the upstream transcription stream and is ready to receive audio.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.stream.ready.v1",
  "time": "2026-05-01T09:00:01Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "ml_session_id": "ml-session-uuid"
  }
}
```

#### WebSocket: `StreamDrainedEvent` [#websocket-streamdrainedevent]

Final ML event for a live session. Core can now close the socket, verify final audio, and publish post-meeting work.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.stream.drained.v1",
  "time": "2026-05-01T10:00:10Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "ml_session_id": "ml-session-uuid",
    "last_audio_sequence_processed": 35998,
    "final_segment_count": 812,
    "closed_provider_sessions": ["assemblyai"]
  }
}
```

## Pub/Sub [#pubsub]

### `meeting.session.terminated.v1` [#meetingsessionterminatedv1]

Signals ML to drain and finalise a live streaming session. Core publishes this when the user stops, the duration limit is reached, or the client disappears.

|                      |                                              |
| -------------------- | -------------------------------------------- |
| **Producer**         | Core                                         |
| **Consumer**         | ML streaming coordinator                     |
| **CloudEvents type** | `com.wordloop.meeting.session.terminated.v1` |
| **Ordering key**     | `meeting_id`                                 |
| **Dead-letter**      | `meeting-session-terminated-dlq`             |

```json
{
  "ml_session_id": "ml-session-uuid",
  "meeting_id": "meeting-uuid",
  "user_id": "user-uuid",
  "reason": "user_requested",
  "last_received_sequence": 35998,
  "audio_storage_prefix": "gs://wordloop-audio/meetings/meeting-uuid/chunks/",
  "audio_composed_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm"
}
```

Valid reasons: `user_requested`, `duration_limit`, `connection_closed`, `server_shutdown`, `storage_failure`.


# Synthesis (/docs/work/meeting-recording/tdd/contracts/synthesis)


# Synthesis [#synthesis]

Synthesis artefacts are the ML-generated summary, topics, and talking points for a meeting. During a live session, talking points stream incrementally. After post-meeting processing, synthesis is atomically replaced with the final version. Headline is a separate meeting-level field — see [Meeting](meeting). For shared semantics, see [Infrastructure](infrastructure).

## Resource Shape [#resource-shape]

```json
{
  "summary": "The team aligned on rollout sequencing and follow-up owners.",
  "topics": [
    {
      "id": "topic-uuid",
      "name": "Launch readiness",
      "summary": "Discussed go/no-go criteria for next week's launch.",
      "is_final": true,
      "segments": [{ "segment_id": "segment-uuid" }]
    }
  ],
  "talking_points": [
    {
      "id": "talking-point-uuid",
      "content": "Design review is the next blocker.",
      "is_final": true,
      "segments": [{ "segment_id": "segment-uuid" }],
      "topic_id": "topic-uuid"
    }
  ]
}
```

## REST API [#rest-api]

### `GET /meetings/{id}/synthesis` [#get-meetingsidsynthesis]

Returns the synthesis artefacts for a meeting.

|              |                                                        |
| ------------ | ------------------------------------------------------ |
| **Auth**     | `bearerAuth`                                           |
| **Response** | `200 MeetingSynthesis`                                 |
| **Errors**   | `404` meeting not found or synthesis not yet generated |

### `PUT /meetings/{id}/synthesis` — ML Write-Back [#put-meetingsidsynthesis--ml-write-back]

Atomically replaces final synthesis artefacts: summary, topics, and final talking points.

|                  |                                                                                                      |
| ---------------- | ---------------------------------------------------------------------------------------------------- |
| **Auth**         | service auth                                                                                         |
| **Idempotency**  | Required                                                                                             |
| **Response**     | `204 No Content`                                                                                     |
| **Side effects** | Broadcasts `SynthesisUpdatedEvent` and `EntityChangedEvent { entity: "meeting", action: "updated" }` |

```json
{
  "summary": "The team aligned on rollout sequencing and follow-up owners.",
  "topics": [
    {
      "name": "Launch readiness",
      "segment_ids": ["segment-uuid"]
    }
  ],
  "talking_points": [
    {
      "content": "Design review is the next blocker.",
      "segment_ids": ["segment-uuid"]
    }
  ]
}
```

### `GET /meetings/{id}/talking-points` [#get-meetingsidtalking-points]

Returns talking points for a meeting. During a live session, includes draft (non-final) talking points.

|              |                        |
| ------------ | ---------------------- |
| **Auth**     | `bearerAuth`           |
| **Response** | `200 TalkingPointList` |

### `POST /meetings/{id}/talking-points` — ML Write-Back [#post-meetingsidtalking-points--ml-write-back]

Creates or updates a live talking point emitted by ML.

|                  |                                                                                                   |
| ---------------- | ------------------------------------------------------------------------------------------------- |
| **Auth**         | service auth                                                                                      |
| **Idempotency**  | Required                                                                                          |
| **Response**     | `201 Created` or `200 OK` with `TalkingPoint` + `Location: /meetings/{id}/talking-points/{tp_id}` |
| **Side effects** | Broadcasts `TalkingPointEvent` for live clients; `EntityChangedEvent` for cache revalidation      |

```json
{
  "id": "talking-point-uuid",
  "content": "The rollout plan needs design review before launch.",
  "segment_ids": ["segment-uuid"],
  "is_final": false
}
```
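
The "creates or updates" behaviour, with its `201 Created` or `200 OK` response, amounts to an upsert keyed by the talking point `id`. The sketch below uses an in-memory dictionary as a stand-in for Core's database; `upsert_talking_point` and the key layout are hypothetical.

```python
from typing import Dict, Tuple


def upsert_talking_point(
    store: Dict[Tuple[str, str], dict], meeting_id: str, body: dict
) -> Tuple[int, dict, str]:
    """Insert or update a talking point; returns (status, resource, Location)."""
    key = (meeting_id, body["id"])
    status = 200 if key in store else 201
    # Later writes overwrite earlier fields, so a draft can become final.
    store[key] = {**store.get(key, {}), **body}
    location = f"/meetings/{meeting_id}/talking-points/{body['id']}"
    return status, store[key], location
```

A draft sent with `is_final: false` and later re-sent with `is_final: true` would return `201` the first time and `200` the second, with the same `Location`.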

### `GET /meetings/{id}/topics` [#get-meetingsidtopics]

Returns topics extracted from the meeting transcript.

|              |                 |
| ------------ | --------------- |
| **Auth**     | `bearerAuth`    |
| **Response** | `200 TopicList` |

***

## Real-Time Events [#real-time-events]

### Core → Browser [#core--browser]

#### `TalkingPointEvent` [#talkingpointevent]

Streams a live talking point without forcing the client to re-fetch synthesis.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.meeting.talking_point.v1",
  "time": "2026-05-01T09:04:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "talking_point": {
      "id": "talking-point-uuid",
      "content": "The rollout plan needs design review before launch.",
      "segment_ids": ["segment-uuid"],
      "is_final": false
    }
  }
}
```

#### `SynthesisUpdatedEvent` — **New** [#synthesisupdatedevent--new]

Signals that synthesis artefacts (summary, topics, talking points) have been updated. Fired after `PUT /meetings/{id}/synthesis` completes. Clients should reload via `GET /meetings/{id}/synthesis`.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.meeting.synthesis.updated.v1",
  "time": "2026-05-01T10:05:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "version": 2
  }
}
```
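
A client-side handler for this event might look like the sketch below. It assumes that `version` increases with each synthesis replacement, which this document does not state explicitly, and it uses a hypothetical `fetch` callback standing in for `GET /meetings/{id}/synthesis`.

```python
from typing import Callable, Dict


class SynthesisReloader:
    """Reloads synthesis only when an event carries a newer version."""

    def __init__(self, fetch: Callable[[str], None]) -> None:
        # Hypothetical wrapper around GET /meetings/{id}/synthesis.
        self._fetch = fetch
        self._seen: Dict[str, int] = {}

    def on_event(self, event: dict) -> bool:
        data = event["data"]
        meeting_id, version = data["meeting_id"], data["version"]
        if version <= self._seen.get(meeting_id, 0):
            return False  # stale or duplicate delivery; skip the reload
        self._seen[meeting_id] = version
        self._fetch(meeting_id)
        return True
```

The version guard keeps a redelivered event from triggering a second, redundant fetch.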

***

## ML Integration [#ml-integration]

### ML → Core [#ml--core]

#### WebSocket: `TalkingPointProducedEvent` [#websocket-talkingpointproducedevent]

Emits a live talking point from the batched insight pipeline. Talking points and [tasks](task) are extracted by the same LLM structured-output call.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.talking_point.v1",
  "time": "2026-05-01T09:04:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "talking_point": {
      "id": "talking-point-uuid",
      "content": "The rollout plan needs design review before launch.",
      "segment_ids": ["segment-uuid"],
      "is_final": false
    }
  }
}
```


# Task (/docs/work/meeting-recording/tdd/contracts/task)


# Task [#task]

Tasks are action items associated with a meeting. They can be user-created or system-generated by ML during live recording. Tasks support nesting (sub-tasks), assignment to people, and due dates. For shared semantics, see [Infrastructure](infrastructure).

## Resource Shape [#resource-shape]

```json
{
  "id": "task-uuid",
  "meeting_id": "meeting-uuid",
  "content": "Send the rollout plan to design",
  "status": "pending",
  "source": "system",
  "assigned_to": "person-uuid",
  "due_date": "2026-05-08",
  "parent_task_id": null,
  "sub_task_summary": { "total": 2, "completed": 1 },
  "created_at": "2026-05-01T09:04:30Z"
}
```

Valid `source` values: `user`, `system`. Editing a system-generated task promotes it to `user`.

Valid `status` values: `pending`, `completed`.

## REST API [#rest-api]

### `GET /meetings/{id}/tasks` [#get-meetingsidtasks]

Returns tasks for a specific meeting.

|                  |                   |
| ---------------- | ----------------- |
| **Auth**         | `bearerAuth`      |
| **Response**     | `200 TaskList`    |
| **Query params** | `cursor`, `limit` |

### `GET /tasks` [#get-tasks]

Lists all tasks across meetings for the authenticated user.

|                  |                                                          |
| ---------------- | -------------------------------------------------------- |
| **Auth**         | `bearerAuth`                                             |
| **Response**     | `200 TaskList`                                           |
| **Query params** | `cursor`, `limit`, `meeting_id`, `status`, `assigned_to` |

### `POST /tasks` [#post-tasks]

Creates a task. User-created tasks come from the app; system-generated tasks come from ML through Core.

|                      |                                                                                                                                                                         |
| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Auth**             | `bearerAuth` or service auth                                                                                                                                            |
| **Idempotency**      | Required                                                                                                                                                                |
| **Echo suppression** | `Client-Session-Id` optional                                                                                                                                            |
| **Response**         | `201 Created` with `Task` + `Location: /tasks/{id}`                                                                                                                     |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "task", action: "created" }`. During live sessions, Core also emits `TaskEvent` with the full task for immediate UI insertion. |

```json
{
  "meeting_id": "meeting-uuid",
  "content": "Send the rollout plan to design",
  "assigned_to": "person-uuid",
  "due_date": "2026-05-08",
  "parent_task_id": null,
  "source": "user"
}
```

### `GET /tasks/{id}` [#get-tasksid]

Returns a single task with sub-task summary.

|              |                      |
| ------------ | -------------------- |
| **Auth**     | `bearerAuth`         |
| **Response** | `200 Task`           |
| **Errors**   | `404` task not found |

### `PATCH /tasks/{id}` [#patch-tasksid]

Updates a task. Editing a system-generated task promotes it to `source: "user"`.

|                      |                                                                       |
| -------------------- | --------------------------------------------------------------------- |
| **Auth**             | `bearerAuth`                                                          |
| **Echo suppression** | `Client-Session-Id` optional                                          |
| **Response**         | `200 Task`                                                            |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "task", action: "updated" }` |

```json
{
  "content": "Send the rollout plan to design and engineering",
  "status": "completed",
  "source": "user"
}
```
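
The promotion rule can be illustrated with a small sketch. The `apply_task_patch` helper is hypothetical, and it assumes that any non-empty patch (including a status-only change) counts as an edit; the contract above only says that editing a system-generated task promotes it.

```python
VALID_STATUS = {"pending", "completed"}


def apply_task_patch(task: dict, patch: dict) -> dict:
    """Return an updated copy of the task; an edit promotes source to 'user'."""
    if "status" in patch and patch["status"] not in VALID_STATUS:
        raise ValueError(f"invalid status: {patch['status']!r}")
    updated = {**task, **patch}
    if patch:
        # Assumption: every non-empty patch is treated as a user edit.
        updated["source"] = "user"
    return updated
```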

### `DELETE /tasks/{id}` [#delete-tasksid]

Deletes a task and cascades to sub-tasks.

|                      |                                                                       |
| -------------------- | --------------------------------------------------------------------- |
| **Auth**             | `bearerAuth`                                                          |
| **Echo suppression** | `Client-Session-Id` optional                                          |
| **Response**         | `204 No Content`                                                      |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "task", action: "deleted" }` |

### `GET /tasks/{id}/sub-tasks` [#get-tasksidsub-tasks]

Returns sub-tasks for a parent task.

|              |                |
| ------------ | -------------- |
| **Auth**     | `bearerAuth`   |
| **Response** | `200 TaskList` |

### `POST /tasks/{id}/sub-tasks` [#post-tasksidsub-tasks]

Creates a sub-task nested under a parent task.

|                      |                                                                       |
| -------------------- | --------------------------------------------------------------------- |
| **Auth**             | `bearerAuth`                                                          |
| **Idempotency**      | Required                                                              |
| **Echo suppression** | `Client-Session-Id` optional                                          |
| **Response**         | `201 Created` with `Task` + `Location: /tasks/{sub_id}`               |
| **Side effects**     | Broadcasts `EntityChangedEvent { entity: "task", action: "created" }` |

***

## Real-Time Events [#real-time-events]

### Core → Browser [#core--browser]

#### `TaskEvent` [#taskevent]

Carries system-generated tasks produced during live recording. User-originated task mutations use `EntityChangedEvent` only, because the optimistic local state already holds the full payload.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.meeting.task.v1",
  "time": "2026-05-01T09:04:30Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "task": {
      "id": "task-uuid",
      "content": "Send the rollout plan to design",
      "assigned_to": null,
      "due_date": null,
      "parent_task_id": null,
      "source": "system",
      "status": "pending"
    }
  }
}
```
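
The split between the two event kinds could be handled on the client roughly as follows. This is a sketch under assumptions: `handle_event`, the `tasks` cache, and the `stale_ids` set are hypothetical names, and the `EntityChangedEvent` type string and `data.id` field are taken from the Core WebSocket contract rather than from this page.

```python
TASK_EVENT = "com.wordloop.meeting.task.v1"
ENTITY_CHANGED = "com.wordloop.entity.changed.v1"


def handle_event(event: dict, tasks: dict, stale_ids: set) -> None:
    """Insert full tasks directly; mark other changed tasks for re-fetch."""
    data = event["data"]
    if event["type"] == TASK_EVENT:
        task = data["task"]
        tasks[task["id"]] = task  # full payload inline, no re-fetch needed
    elif event["type"] == ENTITY_CHANGED and data.get("entity") == "task":
        stale_ids.add(data["id"])  # re-fetch later via GET /tasks/{id}
```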

***

## ML Integration [#ml-integration]

### ML → Core [#ml--core]

#### WebSocket: `TaskProducedEvent` [#websocket-taskproducedevent]

Emits a live system task from the same structured-output call as [talking points](synthesis). Core persists the task via `POST /tasks` and fans out `TaskEvent` to the browser.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.task.v1",
  "time": "2026-05-01T09:04:30Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "task": {
      "id": "task-uuid",
      "content": "Send the rollout plan to design",
      "assigned_to": null,
      "due_date": null,
      "parent_task_id": null,
      "source": "system"
    }
  }
}
```


# Transcription (/docs/work/meeting-recording/tdd/contracts/transcription)


# Transcription [#transcription]

A transcription tracks the processing lifecycle for a meeting's audio. Each meeting has at most one transcription. Transcript segments are the individual speaker-attributed text fragments produced during live recording and refined during post-meeting batch processing. For shared semantics, see [Infrastructure](infrastructure).

## Resource Shapes [#resource-shapes]

### Transcription [#transcription-1]

```json
{
  "id": "transcription-uuid",
  "meeting_id": "meeting-uuid",
  "status": "transcribing",
  "status_message": "Batch transcription in progress",
  "progress_percent": 45,
  "is_degraded": false,
  "created_at": "2026-05-01T09:00:00Z",
  "updated_at": "2026-05-01T10:01:00Z"
}
```

Valid statuses: `pending`, `transcribing`, `synthesizing`, `completed`, `failed`.

* **`pending`** — created but processing has not started (e.g., waiting for audio upload or first byte of live audio).
* **`transcribing`** — batch transcription and diarisation are in progress.
* **`synthesizing`** — transcript is complete; headline, summary, topics, and talking points are being generated.
* **`completed`** — all artefacts are final.
* **`failed`** — processing failed; `status_message` carries the reason.

### Transcript Segment [#transcript-segment]

```json
{
  "id": "segment-uuid",
  "source_sequence": 1842,
  "revision": 2,
  "speaker_label": "speaker_1",
  "person_id": "person-uuid",
  "text": "Let's follow up tomorrow.",
  "start_ms": 183900,
  "end_ms": 185100,
  "confidence": 0.94,
  "is_final": true,
  "feature_vector": [0.12, -0.34]
}
```

`source_sequence` is assigned by ML as a monotonic counter per transcription session. It is independent of the audio chunk sequence number — the relationship between audio chunks and transcript segments is not 1:1 (one chunk may produce zero or multiple segments). Deduplication uses `(transcription_id, source_sequence, revision)`.
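
A minimal sketch of the deduplication key described above, assuming an in-memory set of keys already seen. The `dedupe_segments` helper is hypothetical; Core would more likely enforce this with a uniqueness constraint in the database.

```python
from typing import Iterable, List, Set, Tuple

DedupKey = Tuple[str, int, int]  # (transcription_id, source_sequence, revision)


def dedupe_segments(
    transcription_id: str, segments: Iterable[dict], seen: Set[DedupKey]
) -> List[dict]:
    """Return only segments whose dedup key has not been seen before."""
    fresh = []
    for seg in segments:
        key = (transcription_id, seg["source_sequence"], seg["revision"])
        if key in seen:
            continue  # retry of an already persisted write
        seen.add(key)
        fresh.append(seg)
    return fresh
```

A higher `revision` for the same `source_sequence` produces a new key, so revised segments pass through while exact retries are dropped.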

## REST API [#rest-api]

### `GET /meetings/{id}/transcriptions` [#get-meetingsidtranscriptions]

Lists transcriptions for a meeting (currently always 0 or 1).

|              |                         |
| ------------ | ----------------------- |
| **Auth**     | `bearerAuth`            |
| **Response** | `200 TranscriptionList` |

### `GET /transcriptions/{id}` [#get-transcriptionsid]

Returns transcription metadata and processing status.

|              |                               |
| ------------ | ----------------------------- |
| **Auth**     | `bearerAuth`                  |
| **Response** | `200 Transcription`           |
| **Errors**   | `404` transcription not found |

### `GET /transcriptions/{id}/segments` [#get-transcriptionsidsegments]

Returns transcript segments with cursor-based pagination. Supports time-range filtering for audio-synced views and ML context recovery.

|                  |                                                                               |
| ---------------- | ----------------------------------------------------------------------------- |
| **Auth**         | `bearerAuth` or service auth                                                  |
| **Response**     | `200 TranscriptSegmentList`                                                   |
| **Query params** | `cursor`, `limit` (default 100, max 500), `after_ms`, `before_ms`, `is_final` |

The `after_ms` and `before_ms` parameters filter by segment `start_ms`, enabling ML to fetch recent segments for LLM context recovery after a pod restart.
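
For example, a context-recovery request could be built as in the sketch below. The `segments_url` helper is hypothetical, and clamping `limit` on the client is an assumption; the document only states the default (100) and the maximum (500).

```python
from typing import Optional
from urllib.parse import urlencode


def segments_url(
    transcription_id: str,
    after_ms: Optional[int] = None,
    before_ms: Optional[int] = None,
    limit: int = 100,
    cursor: Optional[str] = None,
) -> str:
    """Build the relative URL for GET /transcriptions/{id}/segments."""
    params = {"limit": max(1, min(limit, 500))}  # documented maximum is 500
    if cursor is not None:
        params["cursor"] = cursor
    if after_ms is not None:
        params["after_ms"] = after_ms
    if before_ms is not None:
        params["before_ms"] = before_ms
    return f"/transcriptions/{transcription_id}/segments?{urlencode(params)}"
```

A restarted ML pod could call `segments_url(tid, after_ms=last_known_ms)` and follow `cursor` values until the list is exhausted.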

### `POST /transcriptions/{id}/segments` — ML Write-Back [#post-transcriptionsidsegments--ml-write-back]

Appends live transcript segments during an active session. Used for low-latency durable writes.

|                  |                                                                                                                                                                  |
| ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Auth**         | service auth                                                                                                                                                     |
| **Idempotency**  | De-duplicates by `(transcription_id, source_sequence, revision)` — `source_sequence` is ML-assigned (monotonic per session), not the audio chunk sequence number |
| **Response**     | `204 No Content`                                                                                                                                                 |
| **Side effects** | Broadcasts `TranscriptSegmentEvent` for live clients and `EntityChangedEvent { entity: "transcript_segment" }` for cache revalidation                            |

```json
{
  "segments": [
    {
      "id": "segment-uuid",
      "source_sequence": 1842,
      "revision": 1,
      "speaker_label": "speaker_1",
      "person_id": null,
      "text": "Let's follow up tomorrow.",
      "start_ms": 183900,
      "end_ms": 185100,
      "confidence": 0.94,
      "is_final": true
    }
  ]
}
```

### `PUT /transcriptions/{id}/segments` — ML Write-Back [#put-transcriptionsidsegments--ml-write-back]

Atomically replaces all transcript segments after batch transcription completes. This is the post-meeting quality pass — a new transcript version.

|                  |                                                                                                            |
| ---------------- | ---------------------------------------------------------------------------------------------------------- |
| **Auth**         | service auth                                                                                               |
| **Idempotency**  | Required                                                                                                   |
| **Response**     | `204 No Content`                                                                                           |
| **Errors**       | `404` transcription not found; `409` live session still active                                             |
| **Side effects** | Broadcasts `TranscriptRevisedEvent` (not `EntityChangedEvent`) — clients must reload the full segment list |

### `PATCH /transcriptions/{id}/status` — ML Write-Back [#patch-transcriptionsidstatus--ml-write-back]

Updates processing state for the Meeting Summary progress indicator.

|                  |                                                                                                                            |
| ---------------- | -------------------------------------------------------------------------------------------------------------------------- |
| **Auth**         | service auth                                                                                                               |
| **Response**     | `204 No Content`                                                                                                           |
| **Side effects** | Inserts `transcription_status_history` row; broadcasts `EntityChangedEvent { entity: "transcription", action: "updated" }` |

```json
{
  "status": "synthesizing",
  "message": "Generating summary and talking points",
  "progress_percent": 75
}
```

***

## Real-Time Events [#real-time-events]

### Core → Browser [#core--browser]

#### `TranscriptSegmentEvent` [#transcriptsegmentevent]

Carries a full segment for immediate rendering. Interim segments are replaced in-place by later events with the same `id` and a higher `revision`.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.transcript.segment.v1",
  "time": "2026-05-01T09:03:05Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "transcription_id": "transcription-uuid",
    "segment": {
      "id": "segment-uuid",
      "revision": 2,
      "source_sequence": 1842,
      "speaker_label": "speaker_1",
      "person_id": null,
      "text": "Let's follow up tomorrow.",
      "start_ms": 183900,
      "end_ms": 185100,
      "confidence": 0.94,
      "is_final": true
    }
  }
}
```
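
The in-place replacement rule can be sketched as a small reducer. The `apply_segment` function and the `segments` dictionary are hypothetical client-side names; ignoring events with an equal or lower `revision` is an assumption that makes redelivery harmless.

```python
def apply_segment(segments: dict, incoming: dict) -> bool:
    """Store the segment unless a same-or-newer revision is already held."""
    current = segments.get(incoming["id"])
    if current is not None and incoming["revision"] <= current["revision"]:
        return False  # out-of-order or duplicate event
    segments[incoming["id"]] = incoming  # replaces the interim segment in place
    return True
```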

#### `TranscriptRevisedEvent` — **New** [#transcriptrevisedevent--new]

Signals that the entire transcript has been replaced by a post-meeting quality pass. Clients must reload the full segment list via `GET /transcriptions/{id}/segments`. This replaces the ambiguous `EntityChangedEvent { entity: "transcript_segment" }` for the bulk replacement case.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.transcript.revised.v1",
  "time": "2026-05-01T10:05:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "transcription_id": "transcription-uuid",
    "segment_count": 812,
    "version": 2
  }
}
```

***

## ML Integration [#ml-integration]

### ML → Core [#ml--core]

#### WebSocket: `TranscriptSegmentProducedEvent` [#websocket-transcriptsegmentproducedevent]

Emits an interim or final transcript segment. Core immediately fans this out to the app via WebSocket and persists it through Core REST/domain services.

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.transcript.segment.v1",
  "time": "2026-05-01T09:03:05Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "transcription_id": "transcription-uuid",
    "segment": {
      "id": "segment-uuid",
      "source_sequence": 1842,
      "revision": 1,
      "speaker_label": "speaker_1",
      "person_id": null,
      "text": "Let's follow up tomorrow.",
      "start_ms": 183900,
      "end_ms": 185100,
      "confidence": 0.94,
      "is_final": true
    }
  }
}
```

#### WebSocket: `SegmentFeaturesProducedEvent` [#websocket-segmentfeaturesproducedevent]

Sends feature vectors for speaker matching and later voice-profile enrichment. Core persists vectors but does not broadcast them to the browser. For how these feed the speaker identification pipeline, see [Person & Speaker Identity](person).

```json
{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.segment_features.v1",
  "time": "2026-05-01T09:03:06Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "segment_id": "segment-uuid",
    "speaker_label": "speaker_1",
    "embedding_model": "ecapa-tdnn-v1",
    "embedding": [0.12, -0.34]
  }
}
```

### ML Batch Processing [#ml-batch-processing]

Batch processing handles post-meeting transcription and synthesis. Pub/Sub is the normal trigger; REST provides a deterministic control surface for Core and tests.

#### `POST /transcription-jobs/{id}/run` [#post-transcription-jobsidrun]

Starts or resumes a post-meeting transcription job.

|                 |                                                                             |
| --------------- | --------------------------------------------------------------------------- |
| **Auth**        | service auth                                                                |
| **Idempotency** | Required                                                                    |
| **Response**    | `202 Accepted` with job status                                              |
| **Errors**      | `404` job unknown; `409` job already running with a different audio version |

```json
{
  "meeting_id": "meeting-uuid",
  "transcription_id": "transcription-uuid",
  "storage_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm",
  "audio_version": 2,
  "task_extraction_policy": "skip",
  "speaker_profile_policy": "enrich_after_completion"
}
```

#### `GET /transcription-jobs/{id}` [#get-transcription-jobsid]

Returns ML job progress for diagnostics. Core remains the user-facing source of truth for transcription status.

|              |                                |
| ------------ | ------------------------------ |
| **Auth**     | service auth                   |
| **Response** | `200 MLTranscriptionJobStatus` |

```json
{
  "id": "transcription-uuid",
  "meeting_id": "meeting-uuid",
  "status": "transcribing",
  "progress_percent": 45,
  "current_stage": "batch_transcription",
  "started_at": "2026-05-01T10:01:00Z",
  "completed_at": null
}
```

***

## Pub/Sub [#pubsub]

### `transcription-jobs` [#transcription-jobs]

Dispatches batch transcription and synthesis work to ML after an audio upload completes or a live recording has composed `audio.webm`. This is the single actionable trigger for post-meeting processing.

|                      |                                           |
| -------------------- | ----------------------------------------- |
| **Producer**         | Core                                      |
| **Consumer**         | ML post-meeting worker                    |
| **CloudEvents type** | `com.wordloop.transcription.requested.v1` |
| **Ordering key**     | `meeting_id`                              |
| **Idempotency**      | `transcription_id` plus `audio_version`   |
| **Dead-letter**      | `transcription-jobs-dlq`                  |

```json
{
  "transcription_id": "transcription-uuid",
  "meeting_id": "meeting-uuid",
  "user_id": "user-uuid",
  "storage_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm",
  "audio_version": 2,
  "source_type": "live",
  "task_extraction_policy": "skip",
  "speaker_profile_policy": "enrich_after_completion"
}
```

Valid `source_type` values: `upload`, `live`.

Valid `task_extraction_policy` values: `extract`, `skip`, `replace_system`. Live recordings use `skip` because tasks captured during the live session are preserved.

Valid `speaker_profile_policy` values: `enrich_after_completion`, `skip`. Controls whether ML updates voice profiles with session embeddings.
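
A consumer could check these enumerations and derive its idempotency key as in the sketch below. The `validate_job` helper and the `"{transcription_id}:{audio_version}"` key format are assumptions for illustration; the contract only specifies which fields make up the key.

```python
VALID_VALUES = {
    "source_type": {"upload", "live"},
    "task_extraction_policy": {"extract", "skip", "replace_system"},
    "speaker_profile_policy": {"enrich_after_completion", "skip"},
}


def validate_job(payload: dict) -> str:
    """Validate enumerated fields and return an idempotency key."""
    for field, allowed in VALID_VALUES.items():
        if payload.get(field) not in allowed:
            raise ValueError(f"invalid {field}: {payload.get(field)!r}")
    # Key format is hypothetical: transcription_id plus audio_version.
    return f"{payload['transcription_id']}:{payload['audio_version']}"
```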

### Consumer Outcomes [#consumer-outcomes]

| Event                        | Consumer outcome                                                                                                     |
| ---------------------------- | -------------------------------------------------------------------------------------------------------------------- |
| `transcription.requested`    | ML downloads audio, runs batch transcription/synthesis, writes results to Core REST, and updates status transitions. |
| `meeting.session.terminated` | ML drains AssemblyAI, flushes final live segments via Core REST, and closes its ML WebSocket connection.             |


# Pub/Sub Events (/docs/work/delivered/live-capture/03-tdd/contracts/core/pubsub)


# Pub/Sub Events [#pubsub-events]

### `TranscriptionJobCloudEvent` [#transcriptionjobcloudevent]

Dispatched by Core when recording stops (or audio upload completes). Consumed by the ML post-meeting worker.

```yaml
data:
  transcription_id: string   # required
  meeting_id: string         # required
  storage_path: string       # required
  user_id: string            # required
  skip_tasks: boolean        # default: false — set to true for live recordings
```

### `MeetingTerminatedCloudEvent` [#meetingterminatedcloudevent]

Dispatched by Core when a live recording stops. Consumed by ML to drain its AssemblyAI pipeline.

```yaml
data:
  session_id: string    # required
  meeting_id: string    # required
  user_id: string       # required
```

***


# Core REST API (/docs/work/delivered/live-capture/03-tdd/contracts/core/rest)


# Core REST API [#core-rest-api]

### `GET /meetings/{id}/audio-url` [#get-meetingsidaudio-url]

Returns a short-lived signed URL for direct audio playback from Cloud Storage.

|                   |                                                     |
| ----------------- | --------------------------------------------------- |
| **Auth**          | `bearerAuth`                                        |
| **Cache-Control** | `private, no-store`                                 |
| **Response**      | `200 { url: string, expires_at: string }`           |
| **Errors**        | `404` meeting not found; `404` meeting has no audio |
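
Because the URL is short-lived, a player may want to request a fresh one shortly before expiry. The sketch below assumes `expires_at` is an ISO 8601 timestamp with a timezone, which the response schema above does not confirm; `needs_refresh` and the 30-second margin are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def needs_refresh(expires_at: str, now: datetime, margin_s: int = 30) -> bool:
    """Return True when the signed URL expires within the safety margin."""
    # Normalise a trailing "Z" so older Python versions can parse it.
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    return now + timedelta(seconds=margin_s) >= expiry
```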

***

### `POST /meetings` [#post-meetings]

Creates a meeting. Set `source_type: "recording"` for live recordings.

|                      |                                           |
| -------------------- | ----------------------------------------- |
| **Auth**             | `bearerAuth`                              |
| **Idempotency**      | `Idempotency-Key: <uuid>` header required |
| **Echo suppression** | `X-Client-Id: <uuid>` header              |
| **Response**         | `201 Created` with full `Meeting` body    |

***

### `PATCH /meetings/{id}` [#patch-meetingsid]

Partially updates meeting metadata. Used by ML to set the `headline` after post-meeting processing.

|                      |                                                                 |
| -------------------- | --------------------------------------------------------------- |
| **Auth**             | `bearerAuth` or `serviceAuth`                                   |
| **Echo suppression** | `X-Client-Id: <uuid>` header                                    |
| **Response**         | `200` with full `Meeting` body                                  |
| **Side effects**     | Broadcasts `EntityChanged { entity: meeting, action: updated }` |

***

### `POST /meetings/{id}/speaker-labels` [#post-meetingsidspeaker-labels]

Action endpoint that associates an AssemblyAI speaker label with a known Person. Triggers a bulk `person_id` update across all segments sharing that label and enriches the Person's voice embedding.

|                      |                                                                                                                         |
| -------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| **Auth**             | `bearerAuth`                                                                                                            |
| **Idempotency**      | `Idempotency-Key: <uuid>` header required                                                                               |
| **Echo suppression** | `X-Client-Id: <uuid>` header                                                                                            |
| **Response**         | `200 { updated_count: integer }`                                                                                        |
| **Errors**           | `404` meeting not found; `422` `person_id` does not exist; `422` `speaker_label` not found in this meeting's transcript |

**Request body:**

```json
{
  "speaker_label": "Speaker A",
  "person_id": "uuid"
}
```

**Side effects:**

* All `transcript_segments` with `speaker_label = "Speaker A"` receive `person_id`
* Person's `voice_vector` updated from aggregated segment `feature_vector` values
* `EntityChanged { entity: transcript_segment, action: updated }` broadcast (echo-suppressed)
* `EntityChanged { entity: person, action: updated }` broadcast (echo-suppressed)
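
The voice-vector update can be pictured with the sketch below. It assumes the aggregation is an element-wise mean over the matching segments' `feature_vector` values; the document does not specify the aggregation method, and `aggregate_voice_vector` is a hypothetical name.

```python
from typing import List, Sequence


def aggregate_voice_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized feature vectors (assumed method)."""
    if not vectors:
        raise ValueError("no feature vectors to aggregate")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("feature vectors differ in dimension")
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```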

***

### `PUT /meetings/{id}/synthesis` [#put-meetingsidsynthesis]

Atomically replaces the meeting's AI-generated synthesis. Called by ML after post-meeting processing.

|                      |                                                                 |
| -------------------- | --------------------------------------------------------------- |
| **Auth**             | `serviceAuth`                                                   |
| **Echo suppression** | `X-Client-Id: <uuid>` header                                    |
| **Response**         | `204 No Content`                                                |
| **Side effects**     | Broadcasts `EntityChanged { entity: meeting, action: updated }` |

***

### `POST /meetings/{id}/talking-points` [#post-meetingsidtalking-points]

Creates a talking point for a meeting. Called by ML to deliver live insights per finalised segment.

|                      |                                             |
| -------------------- | ------------------------------------------- |
| **Auth**             | `serviceAuth`                               |
| **Idempotency**      | `Idempotency-Key: <uuid>` header required   |
| **Echo suppression** | `X-Client-Id: <uuid>` header                |
| **Response**         | `201 Created` with full `TalkingPoint` body |
| **Side effects**     | Broadcasts `TalkingPointEvent` on WebSocket |

***

### `POST /transcriptions/{transcriptionId}/segments` [#post-transcriptionstranscriptionidsegments]

Appends transcript segments to a transcription. Used on the live path.

|                  |                                                                         |
| ---------------- | ----------------------------------------------------------------------- |
| **Auth**         | `serviceAuth`                                                           |
| **Response**     | `204 No Content`                                                        |
| **Side effects** | Broadcasts `TranscriptSegmentEvent` on WebSocket for each final segment |

***

### `PUT /transcriptions/{transcriptionId}/segments` [#put-transcriptionstranscriptionidsegments]

Atomically replaces all segments for a transcription. Used by the post-meeting batch worker.

|                  |                                                                                                       |
| ---------------- | ----------------------------------------------------------------------------------------------------- |
| **Auth**         | `serviceAuth`                                                                                         |
| **Response**     | `204 No Content`                                                                                      |
| **Errors**       | `404` transcription not found; `409` transcription is still in `processing` state                     |
| **Side effects** | Deletes existing segments, inserts new set, broadcasts `EntityChanged { entity: transcript_segment }` |

***

### `DELETE /meetings/{meetingId}/tasks?source=system` [#delete-meetingsmeetingidtaskssourcesystem]

Deletes all system-generated tasks for a meeting. Used by the post-meeting worker when re-extracting tasks.

|              |                  |
| ------------ | ---------------- |
| **Auth**     | `serviceAuth`    |
| **Response** | `204 No Content` |

***


# Core WebSocket API (/docs/work/delivered/live-capture/03-tdd/contracts/core/websocket)


# Core WebSocket API [#core-websocket-api]

All events follow the CloudEvents v1.0 envelope. Every event that originates from a user action includes `sourceClientId` at the envelope root — the originating client discards events where `sourceClientId` matches its own `X-Client-Id`.

### `EntityChangedEvent` (Server → Client) [#entitychangedevent-server--client]

Cache-invalidation signal. Clients re-fetch the affected entity via REST on receipt.

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.entity.changed.v1",
  "time": "<ISO8601>",
  "sourceClientId": "<client-uuid>",
  "data": {
    "entity": "meeting | person | task | note | transcript_segment | talking_point",
    "action": "created | updated | deleted",
    "id": "<entity-uuid>"
  }
}
```

### `TranscriptSegmentEvent` (Server → Client) [#transcriptsegmentevent-server--client]

Live transcript segment during recording. Carries the full segment payload inline.

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.transcript.segment.v1",
  "time": "<ISO8601>",
  "data": {
    "meeting_id": "<uuid>",
    "segment": {
      "id": "<uuid>",
      "speaker_label": "Speaker A",
      "person_id": "<uuid | null>",
      "text": "string",
      "start_ms": 1200,
      "end_ms": 2400,
      "confidence": 0.96,
      "is_final": true
    }
  }
}
```

### `TalkingPointEvent` (Server → Client) [#talkingpointevent-server--client]

Streamed per finalised segment during live recording. Carries the full payload inline.

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.meeting.talking_point.v1",
  "time": "<ISO8601>",
  "sourceClientId": "<client-uuid>",
  "data": {
    "meeting_id": "<uuid>",
    "talking_point": {
      "id": "<uuid>",
      "content": "string",
      "is_final": true,
      "segments": ["<segment-uuid>"]
    }
  }
}
```

### `TaskEvent` (Server → Client) [#taskevent-server--client]

Streamed in batches (\~60s) during live recording.

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.meeting.task.v1",
  "time": "<ISO8601>",
  "sourceClientId": "<client-uuid>",
  "data": {
    "meeting_id": "<uuid>",
    "task": {
      "id": "<uuid>",
      "content": "string",
      "source": "system"
    }
  }
}
```

### `RecordingStartedEvent` (Server → Client) [#recordingstartedevent-server--client]

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.started.v1",
  "time": "<ISO8601>",
  "data": { "meeting_id": "<uuid>", "session_id": "<opaque-session-id>" }
}
```

### `RecordingStoppedEvent` (Server → Client) [#recordingstoppedevent-server--client]

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.stopped.v1",
  "time": "<ISO8601>",
  "data": { "meeting_id": "<uuid>" }
}
```

### `RecordingDegradedEvent` (Server → Client) [#recordingdegradedevent-server--client]

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.recording.degraded.v1",
  "time": "<ISO8601>",
  "data": { "meeting_id": "<uuid>", "reason": "string" }
}
```

### `StartRecordingCommand` (Client → Server) [#startrecordingcommand-client--server]

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-app/ws",
  "type": "com.wordloop.recording.start.v1",
  "time": "<ISO8601>",
  "data": {
    "meeting_id": "<uuid>",
    "audio_config": { "encoding": "pcm16 | webm | mp3", "sample_rate": 16000, "channels": 1 }
  }
}
```

### `StopRecordingCommand` (Client → Server) [#stoprecordingcommand-client--server]

```json
{
  "specversion": "1.0",
  "id": "<uuid>",
  "source": "wordloop-app/ws",
  "type": "com.wordloop.recording.stop.v1",
  "time": "<ISO8601>",
  "data": { "meeting_id": "<uuid>" }
}
```

***


# ML NDJSON Stream API (/docs/work/delivered/live-capture/03-tdd/contracts/ml/ndjson-stream)


# ML HTTP Streaming API [#ml-http-streaming-api]

The ML service exposes a bidirectional HTTP streaming endpoint. Core opens the connection on `StartRecordingCommand` and holds it open for the duration of the session.

### `POST /streaming/sessions` [#post-streamingsessions]

Creates a streaming session. The HTTP response body remains open.

|                          |                                                                                                                     |
| ------------------------ | ------------------------------------------------------------------------------------------------------------------- |
| **Auth**                 | `serviceAuth` (Core → ML, mTLS service token)                                                                       |
| **Request Content-Type** | `application/json`                                                                                                  |
| **Body**                 | `{ meeting_id, transcription_id, audio_config }`                                                                    |
| **Response**             | `201 Created` — initial JSON `{ session_id }` header line, then `Content-Type: application/x-ndjson` streaming body |
| **Errors**               | `409` a session for this `meeting_id` already exists; `503` AssemblyAI unavailable                                  |

### `POST /streaming/sessions/{session_id}/audio` [#post-streamingsessionssession_idaudio]

Delivers a binary audio chunk to the active session.

|                          |                                                            |
| ------------------------ | ---------------------------------------------------------- |
| **Auth**                 | `serviceAuth`                                              |
| **Request Content-Type** | `application/octet-stream`                                 |
| **Body**                 | Raw binary audio chunk                                     |
| **Response**             | `204 No Content`                                           |
| **Errors**               | `404` session not found; `410` session has been terminated |

### `DELETE /streaming/sessions/{session_id}` [#delete-streamingsessionssession_id]

Terminates the session cleanly. ML drains the AssemblyAI buffer and closes the streaming response.

|              |                         |
| ------------ | ----------------------- |
| **Auth**     | `serviceAuth`           |
| **Response** | `204 No Content`        |
| **Errors**   | `404` session not found |
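
The status codes across the three session endpoints imply a small state machine: one active session per meeting, and terminated sessions remembered so that late audio gets `410` rather than `404`. A minimal in-memory sketch under those assumptions; `SessionRegistry`, `SessionError`, and the use of `uuid4` for session IDs are hypothetical, not the ML service's implementation.

```python
import uuid


class SessionError(Exception):
    """Carries the HTTP status the endpoint would return."""

    def __init__(self, status: int, detail: str):
        super().__init__(detail)
        self.status = status


class SessionRegistry:
    def __init__(self):
        self._active = {}         # session_id -> meeting_id
        self._terminated = set()  # session_ids that were deleted

    def create(self, meeting_id: str) -> str:
        # POST /streaming/sessions: 409 if the meeting already has a session.
        if meeting_id in self._active.values():
            raise SessionError(409, "session already exists for meeting")
        session_id = str(uuid.uuid4())
        self._active[session_id] = meeting_id
        return session_id

    def push_audio(self, session_id: str, chunk: bytes) -> int:
        # POST /streaming/sessions/{session_id}/audio: 410 after termination, 404 if unknown.
        if session_id in self._terminated:
            raise SessionError(410, "session has been terminated")
        if session_id not in self._active:
            raise SessionError(404, "session not found")
        return 204

    def delete(self, session_id: str) -> int:
        # DELETE /streaming/sessions/{session_id}: 404 unless the session is active.
        if session_id not in self._active:
            raise SessionError(404, "session not found")
        del self._active[session_id]
        self._terminated.add(session_id)
        return 204
```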

### Streaming Response Envelope [#streaming-response-envelope]

ML writes NDJSON events to the open response stream. Each newline-terminated line is one event.

```json
{ "type": "transcript_segment", "data": { "segment_id": "<uuid>", "speaker_label": "Speaker A", "text": "Hello world", "start_ms": 1200, "end_ms": 2400, "confidence": 0.96, "is_final": true } }
{ "type": "feature_vector",     "data": { "segment_id": "<uuid>", "vector": [0.12, -0.34, 0.77] } }
{ "type": "speaker_match",      "data": { "segment_id": "<uuid>", "person_id": "<uuid>", "score": 0.91 } }
{ "type": "talking_point",      "data": { "id": "<uuid>", "content": "Discussed Q3 roadmap", "is_final": false, "segments": ["<uuid>"] } }
{ "type": "task",               "data": { "id": "<uuid>", "content": "Follow up with design team", "source": "system" } }
```

Core routes each event type to the appropriate handler:

| Event type           | Core action                                                                |
| -------------------- | -------------------------------------------------------------------------- |
| `transcript_segment` | Broadcast `TranscriptSegmentEvent` on WS → async DB insert                 |
| `feature_vector`     | Async DB update on segment (no WS broadcast)                               |
| `speaker_match`      | Async DB update → broadcast `EntityChanged { entity: transcript_segment }` |
| `talking_point`      | Broadcast `TalkingPointEvent` on WS → async DB upsert                      |
| `task`               | Broadcast `TaskEvent` on WS → async DB insert                              |
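
The routing table can be expressed as a lookup from event type to an ordered list of steps. A minimal sketch in Python (Core itself is written in Go); the `ROUTES` structure, the `plan_stream` name, and the decision to raise on unknown types are assumptions for illustration.

```python
import json

# Step order mirrors the table: speaker_match persists before broadcasting,
# feature_vector never broadcasts, the rest broadcast first.
ROUTES = {
    "transcript_segment": [("broadcast", "TranscriptSegmentEvent"), ("db", "insert")],
    "feature_vector": [("db", "update")],
    "speaker_match": [("db", "update"), ("broadcast", "EntityChanged")],
    "talking_point": [("broadcast", "TalkingPointEvent"), ("db", "upsert")],
    "task": [("broadcast", "TaskEvent"), ("db", "insert")],
}


def plan_stream(lines):
    """Yield (event_type, steps, data) for each non-empty NDJSON line."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between events
        event = json.loads(raw)
        steps = ROUTES.get(event["type"])
        if steps is None:
            raise ValueError(f"unknown NDJSON event type: {event['type']}")
        yield event["type"], steps, event["data"]
```

Because `plan_stream` is a generator, it can consume the open response body line by line without buffering the whole session.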

***


# Full Meeting Experience (/docs/work/delivered/live-capture/03-tdd/milestones/full-experience)


# Milestone 3: Full Meeting Experience [#milestone-3-full-meeting-experience]

> **Integration Lead**: App engineer
> **Combines**: Milestone 2 + Slice 5 (App)

The complete meeting recording experience: record → live insights → post-meeting reprocessing → play back audio with synchronised transcript. The bet is complete when all 12 user stories pass.

**End-to-end tests:**

| Test                                    | Assertion                                                                             |
| --------------------------------------- | ------------------------------------------------------------------------------------- |
| Full upload → reprocess → playback flow | User records meeting, waits for reprocessing, plays audio, transcript highlights sync |
| Transcript click-to-seek                | Clicking a transcript segment seeks audio to that position                            |
| Playback speed control                  | Audio plays at 0.5×, 1×, 1.5×, 2× — transcript sync maintains accuracy at all speeds  |
| Speaker names in playback               | Resolved person names display on transcript segments during playback                  |
| All 12 user stories pass                | Manual walkthrough of every acceptance criterion from the [User Flow](user-flow)      |

***


# App — Playback UI (/docs/work/delivered/live-capture/03-tdd/milestones/full-experience/slice-app)


# Slice 5: App — Playback UI + Transcript Synchronisation [#slice-5-app--playback-ui--transcript-synchronisation]

> **Owner**: App engineer
> **Domain**: App
> **Complexity**: M
> **Status**: 🔧 In Progress
> **Prerequisite**: Slice 4 merged

Audio playback with synchronised transcript highlighting.

### Tasks [#tasks]

* [x] Core: `GET /meetings/{id}/audio-url` and `GET /meetings/{id}/audio` (dev proxy)
* [x] App: `AudioPlayer` component — play/pause, ±10s skip, seek bar, playback speed
* [x] App: transcript sync (highlight the segment where `start_ms ≤ currentTime × 1000 < end_ms`; `currentTime` is in seconds)
* [x] App: click-to-seek — clicking segment sets audio position
* [x] App: display resolved person names on segments
* [ ] App: auto-scroll — keep active transcript segment in view
* [ ] App: handle audio URL expiry — re-fetch before expiry
* [ ] App: regenerate OpenAPI client to include `person_id` on `TranscriptionSegment`
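
The transcript-sync task amounts to finding the segment whose time range contains the playback position. A minimal sketch of that lookup in Python (the App itself is Next.js); it assumes segments are sorted by `start_ms` and do not overlap, and the function name is hypothetical. The playback position is taken in seconds and converted to milliseconds before comparison.

```python
from bisect import bisect_right


def active_segment_index(segments, current_time_s):
    """Return the index of the segment containing the playback position, or None."""
    t_ms = current_time_s * 1000
    starts = [s["start_ms"] for s in segments]
    # Last segment that starts at or before the playback position.
    i = bisect_right(starts, t_ms) - 1
    if i >= 0 and segments[i]["start_ms"] <= t_ms < segments[i]["end_ms"]:
        return i
    return None  # before the first segment, in a gap, or past the end
```

A binary search keeps the lookup cheap enough to run on every time update, even for long transcripts.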

**Test cases:**

| Test                                     | Location      | Assertion                                                                  |
| ---------------------------------------- | ------------- | -------------------------------------------------------------------------- |
| Audio player renders with controls       | `test_app`    | Play/pause, skip, speed, and seek bar visible                              |
| Transcript highlight syncs with playback | `test_app`    | Segment with matching time range has active style                          |
| Click-to-seek works                      | `test_app`    | Clicking timestamp sets `audio.currentTime` to segment's `start_ms / 1000` |
| Auto-scroll follows playback             | `test_app`    | Active segment scrolled into viewport during playback                      |
| Speaker names displayed                  | `test_app`    | Segment with `person_id` shows person name, not speaker label              |
| Full playback flow end-to-end            | `test_system` | User opens completed meeting, plays audio, sees synced transcript          |


# Full Pipeline Operational (/docs/work/delivered/live-capture/03-tdd/milestones/full-pipeline)


# Milestone 2: Full Pipeline Operational [#milestone-2-full-pipeline-operational]

> **Integration Lead**: Core engineer
> **Combines**: Milestone 1 + Slice 4 (Core + ML)

After recording stops, the system automatically re-processes the meeting: a higher-accuracy transcript replaces the live version, talking points are finalised, a headline is generated, and speakers are identified by matching against voice profiles.

**End-to-end tests:**

| Test                                | Assertion                                                                                       |
| ----------------------------------- | ----------------------------------------------------------------------------------------------- |
| Post-meeting reprocessing completes | After recording stops, transcript segments are replaced with higher-accuracy version within 60s |
| Talking points finalised            | Talking points show `is_final: true` after reprocessing                                         |
| Headline generated automatically    | Meeting has a non-null `headline` after reprocessing                                            |
| Live tasks preserved                | Both user-created and system-extracted tasks from the live session remain unchanged             |
| Speaker identification              | Segments attributed to enrolled voice profiles show person names                                |
| UI updates in real time             | Each reprocessed artefact appears in the UI without page refresh (WebSocket events)             |

***


# Core — Post-Meeting (/docs/work/delivered/live-capture/03-tdd/milestones/full-pipeline/slice-core)


# Slice 4: Post-Meeting Reprocessing + Speaker ID [#slice-4-post-meeting-reprocessing--speaker-id]

> **Owner**: Core engineer
> **Domain**: Core
> **Complexity**: M
> **Status**: ✅ Done

The automatic pipeline that runs after recording stops.

### Tasks [#tasks]

* [x] Core: publish `TranscriptionJobMessage` and `MeetingTerminatedMessage` on session stop
* [x] ML: consume `MeetingSessionTerminated`, drain AssemblyAI buffer, and send final segments via REST
* [x] ML: skip task extraction when existing live tasks present
* [x] Core: `POST /transcriptions/{id}/remap-speaker` and `POST /transcriptions/{id}/identify-speakers`
* [x] ML: speaker centroid computation + identify + remap pipeline

**Test cases:**

| Test                                         | Location      | Assertion                                                                  |
| -------------------------------------------- | ------------- | -------------------------------------------------------------------------- |
| Post-meeting pipeline triggers automatically | `test_system` | Recording stop → transcript segments replaced with higher-accuracy version |
| Tasks from live session preserved            | `test_system` | After post-meeting processing, user-created and system tasks still present |
| Speaker labels resolved to person\_id        | `test_core`   | Cosine similarity match → segment `person_id` updated                      |
| Talking points promoted to `is_final: true`  | `test_system` | After post-meeting processing, talking points have `is_final: true`        |
| Headline generated                           | `test_system` | Meeting has non-null `headline` after post-meeting processing              |
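
The `test_core` case above resolves speaker labels through a cosine similarity match. A minimal sketch of such a match in plain Python; the function names, the dictionary of voice profiles, and the `0.8` threshold are assumptions for illustration, not values taken from the implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(vector, profiles, threshold=0.8):
    """Return the person_id whose profile is most similar, or None below the threshold."""
    best_id, best_score = None, threshold
    for person_id, profile in profiles.items():
        score = cosine_similarity(vector, profile)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

Returning `None` below the threshold leaves `person_id` unset, so an unrecognised speaker keeps only its `speaker_label`.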

***


# ML — Post-Meeting (/docs/work/delivered/live-capture/03-tdd/milestones/full-pipeline/slice-ml)


# Slice 4: Post-Meeting Reprocessing + Speaker ID [#slice-4-post-meeting-reprocessing--speaker-id]

> **Owner**: ML engineer
> **Domain**: ML
> **Complexity**: M
> **Status**: ✅ Done

The automatic pipeline that runs after recording stops.

### Tasks [#tasks]

* [x] Core: publish `TranscriptionJobMessage` and `MeetingTerminatedMessage` on session stop
* [x] ML: consume `MeetingSessionTerminated`, drain AssemblyAI buffer, and send final segments via REST
* [x] ML: skip task extraction when existing live tasks present
* [x] Core: `POST /transcriptions/{id}/remap-speaker` and `POST /transcriptions/{id}/identify-speakers`
* [x] ML: speaker centroid computation + identify + remap pipeline
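
The centroid step in the last task can be read as averaging the feature vectors of all segments that share a speaker label. A minimal sketch under that assumption; the function name and the dictionary-shaped segments are hypothetical, and the real pipeline may weight or filter segments differently.

```python
def speaker_centroids(segments):
    """Return {speaker_label: mean feature vector} over segments that have both fields."""
    sums, counts = {}, {}
    for seg in segments:
        label = seg.get("speaker_label")
        vec = seg.get("feature_vector")
        if label is None or vec is None:
            continue  # segments without a label or a vector cannot contribute
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}
```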

**Test cases:**

| Test                                         | Location      | Assertion                                                                  |
| -------------------------------------------- | ------------- | -------------------------------------------------------------------------- |
| Post-meeting pipeline triggers automatically | `test_system` | Recording stop → transcript segments replaced with higher-accuracy version |
| Tasks from live session preserved            | `test_system` | After post-meeting processing, user-created and system tasks still present |
| Speaker labels resolved to person\_id        | `test_core`   | Cosine similarity match → segment `person_id` updated                      |
| Talking points promoted to `is_final: true`  | `test_system` | After post-meeting processing, talking points have `is_final: true`        |
| Headline generated                           | `test_system` | Meeting has non-null `headline` after post-meeting processing              |

***


# Live Recording Operational (/docs/work/delivered/live-capture/03-tdd/milestones/live-recording)


# Milestone 1: Live Recording Operational [#milestone-1-live-recording-operational]

> **Integration Lead**: App engineer (closest to the user)
> **Combines**: Slice 1 (Core) + Slice 2 (ML) + Slice 3 (App)

The user can start a recording, watch live transcription and AI insights stream in, add manual tasks, and stop the recording. The core value proposition of the bet is functional.

**End-to-end tests:**

| Test                          | Assertion                                                                                                        |
| ----------------------------- | ---------------------------------------------------------------------------------------------------------------- |
| Full live recording flow      | User starts recording → sees live transcript → sees talking points → adds task → stops recording → meeting saved |
| Multi-speaker recording       | Recording with 2+ speakers shows distinct speaker labels on segments                                             |
| Graceful degradation          | With ML unavailable: recording continues, audio captured, degraded banner shown                                  |
| Duration limit enforcement    | After configurable limit, recording auto-stops and post-processing triggers                                      |
| Concurrent session prevention | Starting a second recording while one is active → error shown                                                    |

***


# App — Live UI (/docs/work/delivered/live-capture/03-tdd/milestones/live-recording/slice-app)


# Slice 3: App — Live Recording UI + Audio Streaming [#slice-3-app--live-recording-ui--audio-streaming]

> **Owner**: App engineer
> **Domain**: App
> **Complexity**: L
> **Prerequisite**: Slice 1 merged and `./dev gen all` run

The user-facing live recording experience.

### Tasks [#tasks]

* [x] Add recording controls to Meeting Detail (Record / Stop button, state-aware)
* [x] Implement microphone capture: `MediaRecorder` → binary chunks → WebSocket
* [x] Handle `RecordingStartedEvent`, `RecordingStoppedEvent`, `RecordingDegradedEvent`
* [x] Render live transcript, talking points, and tasks
* [x] Task input during recording: optimistic mutation with echo suppression
* [x] Graceful error handling: ML unavailable, session already active

**Test cases:**

| Test                                   | Location      | Assertion                                                                     |
| -------------------------------------- | ------------- | ----------------------------------------------------------------------------- |
| Start recording → indicator visible    | `test_app`    | Recording indicator component renders when session active                     |
| Live transcript renders final segments | `test_app`    | Final segment appears in transcript list                                      |
| Interim segments visually distinct     | `test_app`    | Interim segment has `opacity` or `muted` style                                |
| Task creation is optimistic            | `test_app`    | Task appears in list before server response                                   |
| Echo suppression works                 | `test_app`    | WebSocket event with matching `sourceClientId` is ignored                     |
| Degraded mode shows banner             | `test_app`    | `RecordingDegradedEvent` → warning banner visible                             |
| Full live flow end-to-end              | `test_system` | User starts recording, sees transcript, adds task, stops — all data persisted |

***


# Core — Schema, Endpoints, Events (/docs/work/delivered/live-capture/03-tdd/milestones/live-recording/slice-core)


# Slice 1: Core — Schema, Endpoints, WebSocket Events [#slice-1-core--schema-endpoints-websocket-events]

> **Owner**: Core engineer
> **Domain**: Core
> **Complexity**: L

The foundation everything else builds on. No UI can be built until these contracts exist and the API spec is regenerated.

### Tasks [#tasks]

* [x] Write and apply DB migrations (UUIDv7, `speaker_label`, `start_ms`/`end_ms`, `person_id`, `is_final`, `headline`, `status`)
* [x] Implement `GET /meetings/{id}/audio-url` — returns signed GCS URL
* [x] Implement `POST /meetings/{id}/speaker-labels` — bulk person\_id assignment + voice profile enrichment
* [x] Implement `PUT /transcriptions/{id}/segments` — atomic replacement
* [x] Add new WebSocket events: `RecordingStartedEvent`, `RecordingStoppedEvent`, `RecordingDegradedEvent`, `TalkingPointEvent`, `TaskEvent`
* [x] Implement ML streaming session management and audio GCS upload stream
* [x] Regenerate API spec: `./dev gen all`

**Test cases:**

| Test                                         | Location    | Assertion                                                      |
| -------------------------------------------- | ----------- | -------------------------------------------------------------- |
| Migrations apply cleanly                     | `test_core` | `./dev db migrate` succeeds with no errors                     |
| `GET /audio-url` returns signed URL          | `test_core` | 200 with `url` and `expires_at` fields                         |
| `POST /speaker-labels` updates segments      | `test_core` | All segments with matching `speaker_label` receive `person_id` |
| `PUT /segments` replaces atomically          | `test_core` | New segments replace old; count matches input                  |
| WebSocket events conform to CloudEvents v1.0 | `test_core` | Events include `specversion`, `id`, `source`, `type`, `time`   |
| Unauthenticated requests rejected            | `test_core` | 401 for all new endpoints without auth                         |
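
The CloudEvents conformance case above checks for five envelope attributes. A minimal sketch of such a check; the helper names are hypothetical. Note that CloudEvents v1.0 itself marks `time` as optional, so requiring it here follows this test table rather than the specification.

```python
REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type", "time")


def missing_attributes(event: dict) -> list:
    """Return the required envelope attributes that are absent or empty."""
    return [name for name in REQUIRED_ATTRIBUTES if not event.get(name)]


def is_valid_envelope(event: dict) -> bool:
    """True when all five attributes are present and specversion is exactly '1.0'."""
    return not missing_attributes(event) and event["specversion"] == "1.0"
```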

***


# ML — Streaming API (/docs/work/delivered/live-capture/03-tdd/milestones/live-recording/slice-ml)


# Slice 2: ML — Streaming Session API + NDJSON Routing [#slice-2-ml--streaming-session-api--ndjson-routing]

> **Owner**: ML engineer
> **Domain**: ML
> **Complexity**: L

Introduces the resource-oriented streaming session API and the NDJSON event stream that Core consumes.

### Tasks [#tasks]

* [x] Implement `POST /streaming/sessions` — response body remains open as `application/x-ndjson`
* [x] Implement `POST /streaming/sessions/{id}/audio` — audio chunk delivery
* [x] Implement `DELETE /streaming/sessions/{id}` — clean session termination with AssemblyAI buffer drain
* [x] Add NDJSON event types: `transcript_segment`, `feature_vector`, `speaker_match`, `talking_point`, `task`
* [x] Implement speaker identification logic and task extraction (\~60s cadence)
* [x] Extend post-meeting pipeline to honour `skip_tasks` flag
* [x] Regenerate API spec: `./dev gen all`

**Test cases:**

| Test                                   | Location  | Assertion                                                       |
| -------------------------------------- | --------- | --------------------------------------------------------------- |
| Session lifecycle works end-to-end     | `test_ml` | Create → audio → events stream → delete completes without error |
| All 5 NDJSON event types emitted       | `test_ml` | Each event type appears in the stream with correct envelope     |
| `skip_tasks=true` preserves live tasks | `test_ml` | Post-meeting pipeline does not call task extraction             |
| Session conflict returns 409           | `test_ml` | Second `POST /streaming/sessions` for same meeting → 409        |

***


# Cloud Storage Schema (/docs/work/delivered/live-capture/03-tdd/schemas/core/gcs)


# Cloud Storage Schema [#cloud-storage-schema]

* Bucket: `wordloop-meeting-audio`
* Path format: `meetings/{meeting_id}/{uuid}.{ext}`
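
A minimal sketch of building an object path in this format. The function name is hypothetical, and generating the `{uuid}` component with `uuid4` is an assumption; the schema does not say which UUID version is used.

```python
import uuid

BUCKET = "wordloop-meeting-audio"


def audio_object_path(meeting_id: str, ext: str, object_id: str = None) -> str:
    """Return meetings/{meeting_id}/{uuid}.{ext}, generating the uuid when none is given."""
    object_id = object_id or str(uuid.uuid4())
    # Accept both "webm" and ".webm" so callers cannot produce a double dot.
    return f"meetings/{meeting_id}/{object_id}.{ext.lstrip('.')}"
```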


# PostgreSQL Schema (/docs/work/delivered/live-capture/03-tdd/schemas/core/postgres)


# PostgreSQL Schema [#postgresql-schema]

Target state of every table involved in this bet after all migrations are applied.

### `meetings` [#meetings]

```sql
CREATE TABLE meetings (
    id               UUID        NOT NULL DEFAULT uuidv7(),
    user_id          UUID        NOT NULL,
    title            TEXT        NOT NULL,
    headline         TEXT,                          -- ML-generated one-line summary
    source_type      TEXT        NOT NULL,
    start_time       TIMESTAMPTZ NOT NULL,
    end_time         TIMESTAMPTZ,
    summary          TEXT,
    key_points       JSONB,
    created_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
    deleted_at       TIMESTAMPTZ,

    CONSTRAINT pk_meetings               PRIMARY KEY (id),
    CONSTRAINT fk_meetings_user          FOREIGN KEY (user_id)
        REFERENCES users (id) ON DELETE CASCADE,
    CONSTRAINT chk_meetings_source_type  CHECK (source_type IN ('recording', 'upload', 'text', 'anecdotal')),
    CONSTRAINT chk_meetings_timeline     CHECK (end_time IS NULL OR end_time > start_time)
);
```

### `meeting_audio_files` [#meeting_audio_files]

```sql
CREATE TABLE meeting_audio_files (
    meeting_id       UUID        NOT NULL,
    gcs_path         TEXT,
    status           TEXT        NOT NULL,
    created_at       TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at       TIMESTAMPTZ NOT NULL DEFAULT now(),

    CONSTRAINT pk_meeting_audio_files    PRIMARY KEY (meeting_id),
    CONSTRAINT fk_meeting_audio_meeting  FOREIGN KEY (meeting_id)
        REFERENCES meetings (id) ON DELETE CASCADE,
    CONSTRAINT chk_meeting_audio_status  CHECK (
        status IN ('recording', 'processing', 'completed', 'failed')
    ),
    CONSTRAINT chk_meeting_audio_path    CHECK (
        status IN ('processing', 'failed') OR gcs_path IS NOT NULL
    )
);
```

### `transcript_segments` [#transcript_segments]

```sql
CREATE TABLE transcript_segments (
    id               UUID        NOT NULL DEFAULT uuidv7(),
    transcription_id UUID        NOT NULL,
    speaker_label    TEXT,
    person_id        UUID,
    text             TEXT        NOT NULL,
    start_ms         BIGINT      NOT NULL,
    end_ms           BIGINT      NOT NULL,
    confidence       REAL,
    is_final         BOOLEAN     NOT NULL DEFAULT false,
    is_highlighted   BOOLEAN     NOT NULL DEFAULT false,
    feature_vector   vector(512),
    created_at       TIMESTAMPTZ NOT NULL DEFAULT now(),

    CONSTRAINT pk_transcript_segments    PRIMARY KEY (id),
    CONSTRAINT fk_segments_transcription FOREIGN KEY (transcription_id)
        REFERENCES transcriptions (id) ON DELETE CASCADE,
    CONSTRAINT fk_segments_person        FOREIGN KEY (person_id)
        REFERENCES people (id) ON DELETE SET NULL,
    CONSTRAINT chk_segments_timeline     CHECK (end_ms > start_ms),
    CONSTRAINT chk_segments_confidence   CHECK (confidence IS NULL OR (confidence >= 0 AND confidence <= 1))
);
```

### `talking_points` [#talking_points]

```sql
CREATE TABLE talking_points (
    id         UUID        NOT NULL DEFAULT uuidv7(),
    meeting_id UUID        NOT NULL,
    topic_id   UUID,
    content    TEXT        NOT NULL,
    is_final   BOOLEAN     NOT NULL DEFAULT false,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    deleted_at TIMESTAMPTZ,

    CONSTRAINT pk_talking_points         PRIMARY KEY (id),
    CONSTRAINT fk_talking_points_meeting FOREIGN KEY (meeting_id)
        REFERENCES meetings (id) ON DELETE CASCADE,
    CONSTRAINT fk_talking_points_topic   FOREIGN KEY (topic_id)
        REFERENCES topics (id) ON DELETE SET NULL
);
```