Overview
Architectural approach, success criteria, key decisions, constraints, and navigation map for the Meeting Recording TDD.
Technical Design Document
Status: Agreed
Author: Ryan Nel
Date: 2026-05-01
Success Criteria
| Criterion | Measured by |
|---|---|
| User can start a live recording from the browser and see real-time transcript within 2 seconds of speech | End-to-end latency from mic input to transcript segment on screen |
| Audio is never lost, even across connectivity failures | Zero-gap rate: all chunks reach GCS via direct upload or OPFS gap recovery |
| Post-meeting artefacts (headline, summary, topics, talking points) reach final quality without user action | Transcription status reaches completed and all synthesis artefacts are present |
| Live session degrades gracefully — audio capture continues even if ML or insights fail | Recording produces a complete audio file even when ML is unavailable for part of the session |
| A single active recording per user at any time | Concurrent session guard enforced client-side and server-side |
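The server-side half of the concurrent session guard can be sketched as follows. This is a minimal illustration, not the agreed implementation: `SessionGuard`, `try_start`, and `end` are hypothetical names, and the in-memory dictionary stands in for whatever shared store Core would actually use.

```python
from typing import Dict


class SessionGuard:
    """Hypothetical guard allowing at most one active recording per user."""

    def __init__(self) -> None:
        # Maps user_id -> session_id of that user's active recording.
        self._active: Dict[str, str] = {}

    def try_start(self, user_id: str, session_id: str) -> bool:
        """Register a session; refuse if the user already has one active."""
        if user_id in self._active:
            return False
        self._active[user_id] = session_id
        return True

    def end(self, user_id: str, session_id: str) -> None:
        """Release the slot, but only if this session actually holds it."""
        if self._active.get(user_id) == session_id:
            del self._active[user_id]
```

Checking the session ID in `end` means a stale or rejected session cannot release a slot it never held.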
Architectural Approach
The system connects browser-captured microphone audio to the existing ML pipeline (AssemblyAI transcription, OpenAI insights) via a streaming architecture with three layers of durability.
Core path: Browser → Core (WebSocket, binary frames) → ML (WebSocket) → AssemblyAI (real-time streaming). Insights flow back: ML → Core → Browser on the same WebSocket connections.
Durability strategy: Audio is protected at three levels. The first two run during the live session; the third runs after it ends:
- OPFS shadow buffer — every chunk is written to the browser's Origin Private File System via a dedicated Web Worker before transport. This runs unconditionally.
- GCS chunk storage — each chunk is stored as a separate GCS object keyed by sequence number. Gap recovery backfills any missing chunks from OPFS.
- Post-meeting reprocessing — the composed audio file is batch-transcribed at higher accuracy, replacing live segments entirely.
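Because chunks are keyed by sequence number, gap detection reduces to a set difference. The sketch below assumes sequence numbers start at 0 and are contiguous up to the last chunk of the session; the function names are hypothetical and the inputs stand in for a GCS object listing and an OPFS directory listing.

```python
from typing import Iterable, List, Tuple


def find_missing_chunks(stored: Iterable[int], last_seq: int) -> List[int]:
    """Return sequence numbers in 0..last_seq that have no stored chunk."""
    present = set(stored)
    return [seq for seq in range(last_seq + 1) if seq not in present]


def plan_backfill(
    stored: Iterable[int], opfs: Iterable[int], last_seq: int
) -> Tuple[List[int], List[int]]:
    """Split gaps into chunks OPFS can backfill and chunks that are lost."""
    available = set(opfs)
    missing = find_missing_chunks(stored, last_seq)
    recoverable = [seq for seq in missing if seq in available]
    unrecoverable = [seq for seq in missing if seq not in available]
    return recoverable, unrecoverable
```

For example, if GCS holds chunks 0, 1, 3, and 5 of a six-chunk session and OPFS holds all six, the plan is to backfill chunks 2 and 4. An empty `unrecoverable` list corresponds to the zero-gap criterion being met.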
Key architectural decisions:
- Dual-write (WebSocket for latency, async DB for durability) — a DB hiccup doesn't block the live experience
- Echo-suppressed optimistic mutations — instant UI feedback without double-rendering
- Chunk-based GCS writes with hierarchical compose — enables gap recovery by sequence number; compose at session end
- Sequential post-meeting pipeline — batch transcription must complete before synthesis runs (synthesis depends on final transcript)
- Task preservation — post-meeting processing skips task extraction for live recordings to avoid clobbering user-created tasks
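One common way to get echo-suppressed optimistic mutations is to tag each mutation with a client-generated ID and drop the server broadcast that carries the same ID. The sketch below assumes that scheme; the class, its fields, and the ID format are hypothetical, and the actual mechanism is specified in Data Flow and Contracts.

```python
from typing import Dict, Set


class OptimisticStore:
    """Hypothetical client-side store with echo suppression."""

    def __init__(self) -> None:
        self.items: Dict[str, str] = {}
        self.pending: Set[str] = set()  # mutation IDs awaiting their echo
        self.renders = 0                # counts UI updates, for illustration

    def apply_local(self, mutation_id: str, key: str, value: str) -> None:
        """Apply the user's change immediately and remember its ID."""
        self.pending.add(mutation_id)
        self.items[key] = value
        self.renders += 1

    def on_server_event(self, mutation_id: str, key: str, value: str) -> bool:
        """Apply a broadcast event unless it echoes our own mutation."""
        if mutation_id in self.pending:
            self.pending.discard(mutation_id)
            return False  # already rendered optimistically; skip
        self.items[key] = value
        self.renders += 1
        return True
```

A local edit therefore renders once, when the user makes it, while events that originate elsewhere still render normally.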
For the full rationale on all decisions, see the Design Decisions table in Data Flow.
Constraints
Architectural constraints discovered during design, in addition to the no-gos in the Pitch:
- Sticky session affinity — all WebSocket frames for a session route to the same Core pod. No pod-to-pod event routing (backplane) exists. This is a known scaling constraint, captured as a problem statement.
- Session not resumable after tab close (v1) — OPFS data persists, but the recording session does not. Tab close ends the session. Session recovery is a separate problem statement.
- Desktop browsers only (this bet) — Chrome/Edge primary, Safari 17+ best-effort. Mobile architecture should not be precluded.
- Single AssemblyAI model — no custom vocabulary or domain-specific tuning.
- 5-minute WebSocket replay buffer — reconnects beyond 5 minutes require full REST re-fetch. Captured as a problem statement.
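The replay-buffer constraint can be illustrated with a time-bounded queue. This is one possible reading, in which events older than five minutes are evicted and a client whose last-seen sequence number predates an evicted event must fall back to REST. `ReplayBuffer` and its methods are hypothetical names, and timestamps are passed in explicitly to keep the sketch deterministic.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

REPLAY_WINDOW_SECONDS = 5 * 60


class ReplayBuffer:
    """Hypothetical per-session buffer of recent WebSocket events."""

    def __init__(self, window: float = REPLAY_WINDOW_SECONDS) -> None:
        self.window = window
        self.events: Deque[Tuple[float, int, str]] = deque()  # (ts, seq, payload)
        self.evicted_up_to = -1  # highest sequence number already dropped

    def _evict(self, now: float) -> None:
        """Drop events older than the window, remembering the last one dropped."""
        while self.events and now - self.events[0][0] > self.window:
            _, seq, _ = self.events.popleft()
            self.evicted_up_to = seq

    def append(self, now: float, seq: int, payload: str) -> None:
        self.events.append((now, seq, payload))
        self._evict(now)

    def replay_after(self, last_seen_seq: int, now: float) -> Optional[List[str]]:
        """Return missed payloads, or None if a full REST re-fetch is needed."""
        self._evict(now)
        if last_seen_seq < self.evicted_up_to:
            return None  # some events the client needs are gone
        return [payload for _, seq, payload in self.events if seq > last_seen_seq]
```

Returning `None` rather than a partial list makes the fallback explicit: the caller either gets every missed event or is told to re-fetch.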
Open Questions
| Question | Owner | Status |
|---|---|---|
| Safari 17+ createSyncAccessHandle() support: confirmed in workers? | App | To verify during milestone 1 |
| GCS compose latency for long recordings (10k+ chunks) — need benchmarks | Core | To measure during milestone 2 |
| AssemblyAI v3 turn-based API migration timeline | ML | Monitoring — no action needed for v1 |
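As a starting point for the compose-latency question, the number of compose calls can be estimated before any benchmark is run. GCS accepts at most 32 source objects per compose request, so composing N chunks needs a tree of calls. The sketch below counts calls per level; `compose_plan` is a hypothetical helper, and it assumes every group at every level is composed, which gives an upper bound.

```python
import math
from typing import List

MAX_COMPOSE_SOURCES = 32  # GCS limit on source objects per compose request


def compose_plan(chunk_count: int, fan_in: int = MAX_COMPOSE_SOURCES) -> List[int]:
    """Return the number of compose calls needed at each level of the tree."""
    if chunk_count < 1:
        raise ValueError("chunk_count must be at least 1")
    calls_per_level: List[int] = []
    remaining = chunk_count
    while remaining > 1:
        calls = math.ceil(remaining / fan_in)  # one call per group of <= fan_in
        calls_per_level.append(calls)
        remaining = calls  # each call yields one intermediate object
    return calls_per_level


plan = compose_plan(10_000)
print(plan, sum(plan))  # [313, 10, 1] 324
```

Under these assumptions a 10,000-chunk recording needs three levels and 324 compose calls in total. Calls within a level are independent and could run in parallel, so the benchmark would mainly need to establish per-call latency and the achievable concurrency.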
Navigation Map
| Document | What it covers |
|---|---|
| UI Design | Wireframes, screen states, and interaction patterns |
| Data Flow | 20 sequence diagrams across live session, user mutations, and post-meeting processing |
| Contracts | API shapes for every boundary: REST, WebSocket, Pub/Sub, binary audio frames |
| Schemas | Database table designs for Core and ML |
| Milestones | Build plan broken into shippable slices |
Architecture Scaffolding
To maintain structural consistency and immediate integration with the test suite, always use the CLI to generate the remaining TDD components.
Add a Milestone
./dev new milestone meeting-recording <milestone-slug>
Add a Domain Slice (Automatically connects to pytest suite)
./dev new slice meeting-recording <milestone-slug> <domain> <slice-slug>
Design a Contract
./dev new contract meeting-recording <service> <protocol>
Design a Database Schema
./dev new schema meeting-recording <service> <database-tech>