Data Flow
Meeting Recording — system context, per-flow sequence diagrams, and boundary inventory.
For each step in the User Flow, this page draws what calls what: which service initiates, which responds, what data crosses each boundary. Read each arrow two ways: it is a contract boundary (what shape the data takes) and a sequencing constraint (downstream cannot build until the upstream contract is published).
System Context
Flow 1: Start Recording
Flow 2: Live Audio → Transcription (Lowest Latency Path)
Audio flows from the browser microphone through Core and ML to AssemblyAI. Transcript segments return via the streaming HTTP response — the same connection ML uses to receive audio. This is a bidirectional HTTP stream: audio chunks flow upstream, segments and insights flow downstream.
Core streams segments directly to the client via WebSocket for minimum latency, and persists them to the database asynchronously in the background.
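The forward-first, persist-in-background behaviour described above can be sketched as follows. This is a minimal illustration rather than Core's actual code: `relay_segments`, `send_to_client`, and `persist` are hypothetical names, and the segment shape is assumed to be a plain dict.

```python
import asyncio
from typing import Awaitable, Callable


async def relay_segments(
    segments: list[dict],
    send_to_client: Callable[[dict], Awaitable[None]],
    persist: Callable[[dict], Awaitable[None]],
) -> None:
    """Forward each segment to the client first; persist it in the background."""
    background: set[asyncio.Task] = set()
    for segment in segments:
        # Latency-critical path: the client receives the segment immediately.
        await send_to_client(segment)
        # The database write is scheduled but not awaited here, so a slow
        # write never delays delivery of the next segment.
        task = asyncio.create_task(persist(segment))
        background.add(task)  # keep a reference so the task is not collected
        task.add_done_callback(background.discard)
    # Drain outstanding writes before the session ends.
    await asyncio.gather(*background)
```

The trade-off in this sketch is that a client can briefly see a segment that has not yet been written; a real implementation would also need to decide how failed writes are retried or reported.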
Flow 3a: Live Talking Points (Fast — Per Finalised Segment)
Talking points update on every finalised transcript segment. ML streams them back through the same HTTP stream as transcript segments. Core forwards them to the client via WebSocket and persists them to the database asynchronously, the same dual-write pattern as transcript segments.
Flow 3b: Live Task Extraction (Slow — Every ~60s)
Task extraction runs on a slower cadence. ML buffers segments and periodically checks for action items. Tasks also stream back through the HTTP stream, following the same dual-write pattern.
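One way to picture the buffering cadence is a small time-gated buffer. The sketch below is an assumption about how ML might batch segments, not its real implementation: `TaskExtractionBuffer` and its fields are hypothetical, and the 60-second default is taken from the "~60s" figure in the flow title.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskExtractionBuffer:
    last_flush_at: float       # initialise to the session start time (seconds)
    interval_s: float = 60.0   # assumed cadence, from the "~60s" figure
    _segments: list[str] = field(default_factory=list)

    def add(self, text: str, now: float) -> Optional[list[str]]:
        """Buffer a finalised segment; return the batch once the interval elapses."""
        self._segments.append(text)
        if now - self.last_flush_at < self.interval_s:
            return None  # too early: keep buffering
        batch, self._segments = self._segments, []
        self.last_flush_at = now
        return batch  # the caller would run action-item extraction on this batch
```

Passing `now` explicitly keeps the sketch deterministic; in practice the caller would likely supply a monotonic clock reading.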
Flow 3c: Live Speaker Identification (Per Segment)
Speaker identification is built into the live transcription flow. For every segment, ML extracts a voice embedding and stores it on Core. It then attempts to match the embedding against enrolled voice profiles.
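The matching step is not specified on this page. As an illustration only, the sketch below assumes embeddings are plain float vectors compared by cosine similarity against one stored vector per enrolled person; `match_speaker`, the profile layout, and the 0.75 threshold are all hypothetical.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_speaker(
    embedding: list[float],
    profiles: dict[str, list[float]],
    threshold: float = 0.75,  # assumed cut-off, not a documented value
) -> Optional[str]:
    """Return the best-matching person id, or None if nothing clears the threshold."""
    best_id, best_score = None, threshold
    for person_id, profile in profiles.items():
        score = cosine_similarity(embedding, profile)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

Returning `None` for a weak match leaves the segment with its AssemblyAI speaker label only, which is consistent with the user being able to label it afterwards.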
When a user later maps an AssemblyAI speaker label (e.g. "Speaker A") to a known Person, the system uses all segments with that speaker label to enrich that person's voice profile for improved future matching.
Flow 4: User Creates Task During Recording
This flow follows the standard Optimistic Mutation with Echo-Suppressed Streaming pattern. The user's task is written via REST rather than the streaming path because it is a user-initiated mutation.
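The pattern itself is defined elsewhere, so the following is only one plausible reading of it: the client applies the task locally under a client-generated id, sends it via REST, and drops the matching event when it later arrives on the stream. `TaskStore` and its methods are hypothetical names used for illustration.

```python
import uuid


class TaskStore:
    """Client-side task state with optimistic writes and echo suppression."""

    def __init__(self) -> None:
        self.tasks: dict[str, dict] = {}
        self._pending: set[str] = set()  # ids created locally, echo not yet seen

    def create_optimistic(self, title: str) -> str:
        """Apply the task locally right away; the id would accompany the REST call."""
        task_id = str(uuid.uuid4())
        self.tasks[task_id] = {"id": task_id, "title": title}
        self._pending.add(task_id)
        return task_id

    def on_stream_event(self, task: dict) -> bool:
        """Apply a streamed task; return False if it was a suppressed echo."""
        if task["id"] in self._pending:
            self._pending.discard(task["id"])
            return False  # already shown optimistically, so ignore the echo
        self.tasks[task["id"]] = task
        return True
```

Tasks extracted by ML never appear in the pending set, so they pass through `on_stream_event` and are applied normally.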
Flow 5: User Labels Speaker as Person
When a user identifies "Speaker A" as a known Person, the system enriches that person's voice profile using all segments attributed to that speaker label.
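How a profile is enriched is not described here. A minimal sketch, assuming the profile is a running mean of embeddings and each segment carries hypothetical `speaker` and `embedding` fields, might look like this:

```python
def embeddings_for_label(segments: list[dict], speaker_label: str) -> list[list[float]]:
    """Collect the embeddings of every segment attributed to one speaker label."""
    return [s["embedding"] for s in segments if s["speaker"] == speaker_label]


def enrich_profile(
    profile: list[float],
    sample_count: int,
    new_embeddings: list[list[float]],
) -> tuple[list[float], int]:
    """Fold new embeddings into a running mean; returns (profile, sample_count)."""
    for embedding in new_embeddings:
        sample_count += 1
        # Incremental mean update: mean += (x - mean) / n
        profile = [p + (e - p) / sample_count for p, e in zip(profile, embedding)]
    return profile, sample_count
```

A running mean is only one option; it keeps storage constant per person but weights old and new samples equally.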
Flow 6: Stop Recording
Flow 7: Post-Meeting Processing (Automatic, via Pub/Sub)
Post-meeting processing runs automatically via the shared TranscriptionJob Pub/Sub worker. For live recordings, the job is published with skip_tasks: true to preserve tasks captured during the session.
The worker:
- Batch-transcribes the full audio from GCS (higher accuracy)
- Replaces transcript segments with the improved results
- Generates headline, summary, topics, and finalises talking points (is_final: true)
- Extracts tasks when skip_tasks: false (file upload flow only)
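The worker steps above, including the `skip_tasks` branch, can be sketched as a simple step planner. The function and step names are hypothetical labels for illustration; only the `skip_tasks` flag comes from this page.

```python
def plan_post_meeting_steps(skip_tasks: bool) -> list[str]:
    """Return the ordered worker steps for one TranscriptionJob."""
    steps = [
        "batch_transcribe",    # full audio from GCS, higher accuracy
        "replace_segments",    # overwrite live segments with improved results
        "generate_synthesis",  # headline, summary, topics, final talking points
    ]
    if not skip_tasks:
        # File upload flow only: live recordings keep the tasks captured in-session.
        steps.append("extract_tasks")
    return steps
```

For a live recording the job would be planned with `plan_post_meeting_steps(skip_tasks=True)`, so task extraction is omitted.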
Flow 8: Audio Playback (Signed URL Direct to GCS)
Core generates a short-lived signed URL. The client streams audio directly from Cloud Storage using that URL, with standard HTTP range requests for seeking.
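Seeking relies on the standard HTTP `Range` request header. As a small illustration of what the client sends to the signed URL, the hypothetical helper below builds that header for a byte offset; how a playback position maps to a byte offset depends on the audio format and is not covered here.

```python
from typing import Optional


def range_header(start: int, end: Optional[int] = None) -> dict[str, str]:
    """Build a Range header for bytes start..end (inclusive), or start..EOF."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    last = "" if end is None else str(end)
    return {"Range": f"bytes={start}-{last}"}
```

For example, `range_header(4096)` asks for everything from byte 4096 onward, which is what a player typically requests after a seek.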
Boundary Inventory
Every boundary shown in the diagrams above. Each becomes a contract on the Contracts page.
| Boundary | From → To | Protocol | Data shape |
|---|---|---|---|
| Meeting CRUD | App → Core | REST | POST/PATCH /meetings |
| Recording commands | App → Core | WebSocket | StartRecordingCommand, StopRecordingCommand |
| Audio streaming | App → Core → ML | WebSocket (binary) → HTTP stream | Raw audio chunks |
| Live insights | ML → Core → App | HTTP stream → WebSocket | NDJSON events (5 types) |
| Speaker labels | App → Core | REST | POST /meetings/{id}/speaker-labels |
| Signed URL | App → Core → GCS | REST → GCS signed URL | GET /meetings/{id}/audio-url |
| Post-meeting trigger | Core → ML | Pub/Sub | TranscriptionJob, MeetingSessionTerminated |
| Synthesis write-back | ML → Core | REST | PUT /synthesis, PATCH /meetings, PUT /segments |
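The Live insights boundary carries NDJSON, meaning one JSON object per line. A minimal sketch of how Core might decode it from arbitrary stream chunks is shown below; `NdjsonDecoder` is a hypothetical name, and the `type` values in the usage note are placeholders, not the five event types defined on the Contracts page.

```python
import json


class NdjsonDecoder:
    """Incrementally decode newline-delimited JSON from arbitrary chunks."""

    def __init__(self) -> None:
        self._buffer = ""

    def feed(self, chunk: str) -> list[dict]:
        """Add a chunk; return every complete event it finishes."""
        self._buffer += chunk
        # Everything before the last newline is complete; the remainder is kept
        # until a later chunk supplies the rest of the line.
        *lines, self._buffer = self._buffer.split("\n")
        return [json.loads(line) for line in lines if line.strip()]
```

Feeding `'{"type": "segment", "te'` returns an empty list, and a follow-up chunk `'xt": "hi"}\n'` returns the single completed event, so chunk boundaries never need to align with event boundaries.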