Transcript processing lifecycle and segments — CRUD, live streaming events, batch processing, ML write-back, and Pub/Sub trigger.

Transcription

A transcription tracks the processing lifecycle for a meeting's audio. Each meeting has at most one transcription. Transcript segments are the individual speaker-attributed text fragments produced during live recording and refined during post-meeting batch processing. For shared semantics, see Infrastructure.

Resource Shapes

Transcription

{
  "id": "transcription-uuid",
  "meeting_id": "meeting-uuid",
  "status": "transcribing",
  "status_message": "Batch transcription in progress",
  "progress_percent": 45,
  "is_degraded": false,
  "created_at": "2026-05-01T09:00:00Z",
  "updated_at": "2026-05-01T10:01:00Z"
}

Valid statuses: pending, transcribing, synthesizing, completed, failed.

pending — created but processing has not started (e.g., waiting for audio upload or first byte of live audio).
transcribing — batch transcription and diarisation are in progress.
synthesizing — transcript is complete; headline, summary, topics, and talking points are being generated.
completed — all artefacts are final.
failed — processing failed; status_message carries the reason.
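
As an illustration of how a client might consume these statuses, the following minimal sketch models them as an enum and decides whether a progress indicator should keep refreshing. Treating `completed` and `failed` as terminal is an assumption drawn from the descriptions above; the names `TranscriptionStatus` and `is_terminal` are hypothetical and not part of the API.

```python
from enum import Enum


class TranscriptionStatus(str, Enum):
    """The five statuses listed above."""
    PENDING = "pending"
    TRANSCRIBING = "transcribing"
    SYNTHESIZING = "synthesizing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: no further transitions happen after these two statuses.
TERMINAL_STATUSES = {TranscriptionStatus.COMPLETED, TranscriptionStatus.FAILED}


def is_terminal(status: str) -> bool:
    """Return True when processing has finished, successfully or not.

    Raises ValueError for a status string outside the documented set.
    """
    return TranscriptionStatus(status) in TERMINAL_STATUSES
```

A caller would typically stop polling or hide the progress bar once `is_terminal(transcription["status"])` returns true, and show `status_message` when the status is `failed`.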

Transcript Segment

{
  "id": "segment-uuid",
  "source_sequence": 1842,
  "revision": 2,
  "speaker_label": "speaker_1",
  "person_id": "person-uuid",
  "text": "Let's follow up tomorrow.",
  "start_ms": 183900,
  "end_ms": 185100,
  "confidence": 0.94,
  "is_final": true,
  "feature_vector": [0.12, -0.34]
}

source_sequence is assigned by ML as a monotonic counter per transcription session. It is independent of the audio chunk sequence number — the relationship between audio chunks and transcript segments is not 1:1 (one chunk may produce zero or multiple segments). Deduplication uses (transcription_id, source_sequence, revision).
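
The deduplication rule can be sketched as a set lookup on the three-part key. This is only an in-memory illustration: `dedupe_segments` and the `seen` set are hypothetical names, and a real store would more likely enforce the same key with a unique constraint.

```python
from typing import Iterable


def dedupe_segments(transcription_id: str,
                    segments: Iterable[dict],
                    seen: set) -> list:
    """Return only segments whose dedup key has not been seen before.

    The key is (transcription_id, source_sequence, revision), as described
    above. `seen` is mutated so that repeated calls share state.
    """
    fresh = []
    for segment in segments:
        key = (transcription_id, segment["source_sequence"], segment["revision"])
        if key in seen:
            continue  # duplicate delivery of the same revision: drop it
        seen.add(key)
        fresh.append(segment)
    return fresh
```

Note that a higher `revision` of the same `source_sequence` produces a different key, so revisions are accepted while exact retries are dropped.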

REST API

`GET /meetings/{id}/transcriptions`

Lists transcriptions for a meeting (currently always 0 or 1).


Auth	`bearerAuth`
Response	`200 TranscriptionList`

`GET /transcriptions/{id}`

Returns transcription metadata and processing status.


Auth	`bearerAuth`
Response	`200 Transcription`
Errors	`404` transcription not found

`GET /transcriptions/{id}/segments`

Returns transcript segments with cursor-based pagination. Supports time-range filtering for audio-synced views and ML context recovery.


Auth	`bearerAuth` or service auth
Response	`200 TranscriptSegmentList`
Query params	`cursor`, `limit` (default 100, max 500), `after_ms`, `before_ms`, `is_final`

The after_ms and before_ms parameters filter by segment start_ms, enabling ML to fetch recent segments for LLM context recovery after a pod restart.
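
A context-recovery request could be assembled as in the sketch below. The host name, the function name `segments_url`, and the `true`/`false` encoding of `is_final` are assumptions for illustration; only the path and parameter names come from the table above. Whether the time bounds are inclusive is not specified here, so the sketch does not rely on either reading.

```python
from typing import Optional
from urllib.parse import urlencode

DEFAULT_LIMIT = 100  # documented default
MAX_LIMIT = 500      # documented maximum


def segments_url(base_url: str,
                 transcription_id: str,
                 *,
                 cursor: Optional[str] = None,
                 limit: int = DEFAULT_LIMIT,
                 after_ms: Optional[int] = None,
                 before_ms: Optional[int] = None,
                 is_final: Optional[bool] = None) -> str:
    """Build a GET /transcriptions/{id}/segments URL with optional filters."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    params = {"limit": limit}
    if cursor is not None:
        params["cursor"] = cursor
    if after_ms is not None:
        params["after_ms"] = after_ms
    if before_ms is not None:
        params["before_ms"] = before_ms
    if is_final is not None:
        # Assumption: booleans are sent as lowercase strings.
        params["is_final"] = "true" if is_final else "false"
    path = f"/transcriptions/{transcription_id}/segments"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"
```

For example, after a pod restart at roughly the three-minute mark, ML might call `segments_url("https://core.example", "t-1", limit=200, after_ms=120000, is_final=True)` to fetch only final segments that started in the last minute.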

`POST /transcriptions/{id}/segments` — ML Write-Back

Appends live transcript segments during an active session. ML uses this endpoint for low-latency durable writes.


Auth	service auth
Idempotency	De-duplicates by `(transcription_id, source_sequence, revision)` — `source_sequence` is ML-assigned (monotonic per session), not the audio chunk sequence number
Response	`204 No Content`
Side effects	Broadcasts `TranscriptSegmentEvent` for live clients and `EntityChangedEvent { entity: "transcript_segment" }` for cache revalidation

{
  "segments": [
    {
      "id": "segment-uuid",
      "source_sequence": 1842,
      "revision": 1,
      "speaker_label": "speaker_1",
      "person_id": null,
      "text": "Let's follow up tomorrow.",
      "start_ms": 183900,
      "end_ms": 185100,
      "confidence": 0.94,
      "is_final": true
    }
  ]
}

`PUT /transcriptions/{id}/segments` — ML Write-Back

Atomically replaces all transcript segments after batch transcription completes. This is the post-meeting quality pass — a new transcript version.


Auth	service auth
Idempotency	Required
Response	`204 No Content`
Errors	`404` transcription not found; `409` live session still active
Side effects	Broadcasts `TranscriptRevisedEvent` (not `EntityChangedEvent`) — clients must reload the full segment list

`PATCH /transcriptions/{id}/status` — ML Write-Back

Updates processing state for the Meeting Summary progress indicator.


Auth	service auth
Response	`204 No Content`
Side effects	Inserts `transcription_status_history` row; broadcasts `EntityChangedEvent { entity: "transcription", action: "updated" }`

{
  "status": "synthesizing",
  "message": "Generating summary and talking points",
  "progress_percent": 75
}

Real-Time Events

Core → Browser

`TranscriptSegmentEvent`

Carries a full segment for immediate rendering. Interim segments are replaced in-place by later events with the same id and a higher revision.

{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.transcript.segment.v1",
  "time": "2026-05-01T09:03:05Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "transcription_id": "transcription-uuid",
    "segment": {
      "id": "segment-uuid",
      "revision": 2,
      "source_sequence": 1842,
      "speaker_label": "speaker_1",
      "person_id": null,
      "text": "Let's follow up tomorrow.",
      "start_ms": 183900,
      "end_ms": 185100,
      "confidence": 0.94,
      "is_final": true
    }
  }
}
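
The in-place replacement rule can be sketched as a small client-side reducer keyed by segment `id`. Ignoring events whose revision is equal to or lower than the stored one, and ordering the view by `start_ms`, are assumptions made here to tolerate duplicate or out-of-order delivery; the function names are hypothetical.

```python
def apply_segment_event(segments_by_id: dict, event: dict) -> bool:
    """Apply one TranscriptSegmentEvent to local state.

    Returns True if the segment was inserted or replaced, False if the
    event was stale (same id, revision not higher than what is stored).
    """
    segment = event["data"]["segment"]
    current = segments_by_id.get(segment["id"])
    if current is not None and current["revision"] >= segment["revision"]:
        return False  # assumption: stale or duplicate events are ignored
    segments_by_id[segment["id"]] = segment
    return True


def ordered_segments(segments_by_id: dict) -> list:
    """Return segments in display order (assumed: by start time)."""
    return sorted(segments_by_id.values(),
                  key=lambda s: (s["start_ms"], s["source_sequence"]))
```

Keying by `id` rather than by `source_sequence` follows the sentence above, which identifies a replacement by the same `id` and a higher `revision`.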

`TranscriptRevisedEvent` — New

Signals that the entire transcript has been replaced by a post-meeting quality pass. Clients must reload the full segment list via GET /transcriptions/{id}/segments. This replaces the ambiguous EntityChangedEvent { entity: "transcript_segment" } for the bulk replacement case.

{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-core/ws",
  "type": "com.wordloop.transcript.revised.v1",
  "time": "2026-05-01T10:05:00Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "transcription_id": "transcription-uuid",
    "segment_count": 812,
    "version": 2
  }
}
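
A client can tell the two browser-facing events apart by their CloudEvents `type`. The sketch below maps each event to the action described in this section; the tuple-based return value and the name `plan_action` are hypothetical conveniences, not part of the protocol.

```python
SEGMENT_EVENT_TYPE = "com.wordloop.transcript.segment.v1"
REVISED_EVENT_TYPE = "com.wordloop.transcript.revised.v1"


def plan_action(event: dict) -> tuple:
    """Decide what a client should do with an incoming event.

    Returns (action, argument):
      ("upsert_segment", segment_id)  for TranscriptSegmentEvent
      ("reload_segments", path)       for TranscriptRevisedEvent
      ("ignore", None)                for any other event type
    """
    event_type = event.get("type")
    if event_type == SEGMENT_EVENT_TYPE:
        return ("upsert_segment", event["data"]["segment"]["id"])
    if event_type == REVISED_EVENT_TYPE:
        # The whole transcript was replaced: discard local segments and
        # refetch the full list from the REST endpoint.
        transcription_id = event["data"]["transcription_id"]
        return ("reload_segments", f"/transcriptions/{transcription_id}/segments")
    return ("ignore", None)
```

On `reload_segments`, local segment state should be dropped rather than merged, since the revised transcript is a complete replacement.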

ML Integration

ML → Core

WebSocket: `TranscriptSegmentProducedEvent`

Emits an interim or final transcript segment. Core immediately fans this out to the app via WebSocket and persists it through Core REST/domain services.

{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.transcript.segment.v1",
  "time": "2026-05-01T09:03:05Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "transcription_id": "transcription-uuid",
    "segment": {
      "id": "segment-uuid",
      "source_sequence": 1842,
      "revision": 1,
      "speaker_label": "speaker_1",
      "person_id": null,
      "text": "Let's follow up tomorrow.",
      "start_ms": 183900,
      "end_ms": 185100,
      "confidence": 0.94,
      "is_final": true
    }
  }
}

WebSocket: `SegmentFeaturesProducedEvent`

Sends feature vectors for speaker matching and later voice-profile enrichment. Core persists vectors but does not broadcast them to the browser. For how these feed the speaker identification pipeline, see Person & Speaker Identity.

{
  "specversion": "1.0",
  "id": "event-uuid",
  "source": "wordloop-ml/ws",
  "type": "com.wordloop.ml.segment_features.v1",
  "time": "2026-05-01T09:03:06Z",
  "traceparent": "00-...",
  "data": {
    "meeting_id": "meeting-uuid",
    "segment_id": "segment-uuid",
    "speaker_label": "speaker_1",
    "embedding_model": "ecapa-tdnn-v1",
    "embedding": [0.12, -0.34]
  }
}

ML Batch Processing

Batch processing handles post-meeting transcription and synthesis. Pub/Sub is the normal trigger; REST provides a deterministic control surface for Core and tests.

`POST /transcription-jobs/{id}/run`

Starts or resumes a post-meeting transcription job.


Auth	service auth
Idempotency	Required
Response	`202 Accepted` with job status
Errors	`404` job unknown; `409` job already running with a different audio version

{
  "meeting_id": "meeting-uuid",
  "transcription_id": "transcription-uuid",
  "storage_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm",
  "audio_version": 2,
  "task_extraction_policy": "skip",
  "speaker_profile_policy": "enrich_after_completion"
}
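
The documented responses suggest a decision along the following lines. This is a sketch under assumptions: the in-memory `jobs` mapping and its `running` and `audio_version` fields are hypothetical, and treating a repeat request with the same audio version as an accepted no-op is an interpretation of "Starts or resumes", not a confirmed behaviour.

```python
def run_response_status(jobs: dict, job_id: str, requested_audio_version: int) -> int:
    """Return the HTTP status the run endpoint would answer with.

    404: the job id is unknown.
    409: the job is already running against a different audio version.
    202: otherwise (start, resume, or repeat of the same request).
    """
    job = jobs.get(job_id)
    if job is None:
        return 404
    if job["running"] and job["audio_version"] != requested_audio_version:
        return 409
    return 202
```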

`GET /transcription-jobs/{id}`

Returns ML job progress for diagnostics. Core remains the user-facing source of truth for transcription status.


Auth	service auth
Response	`200 MLTranscriptionJobStatus`

{
  "id": "transcription-uuid",
  "meeting_id": "meeting-uuid",
  "status": "transcribing",
  "progress_percent": 45,
  "current_stage": "batch_transcription",
  "started_at": "2026-05-01T10:01:00Z",
  "completed_at": null
}

Pub/Sub

`transcription-jobs`

Dispatches batch transcription and synthesis work to ML after an audio upload completes or after a live recording's audio has been composed into audio.webm. This is the single actionable trigger for post-meeting processing.


Producer	Core
Consumer	ML post-meeting worker
CloudEvents type	`com.wordloop.transcription.requested.v1`
Ordering key	`meeting_id`
Idempotency	`transcription_id` plus `audio_version`
Dead-letter	`transcription-jobs-dlq`

{
  "transcription_id": "transcription-uuid",
  "meeting_id": "meeting-uuid",
  "user_id": "user-uuid",
  "storage_path": "gs://wordloop-audio/meetings/meeting-uuid/audio.webm",
  "audio_version": 2,
  "source_type": "live",
  "task_extraction_policy": "skip",
  "speaker_profile_policy": "enrich_after_completion"
}

Valid source_type values: upload, live.

Valid task_extraction_policy values: extract, skip, replace_system. Live recordings use skip because tasks captured during the live session are preserved.

Valid speaker_profile_policy values: enrich_after_completion, skip. Controls whether ML updates voice profiles with session embeddings.
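
A consumer might validate the enumerated fields and derive its idempotency key as sketched below. Treating all three fields as required and joining the key parts with `:` are assumptions; the documentation only states which fields make up the key and which values are valid.

```python
ALLOWED_VALUES = {
    "source_type": {"upload", "live"},
    "task_extraction_policy": {"extract", "skip", "replace_system"},
    "speaker_profile_policy": {"enrich_after_completion", "skip"},
}


def validate_request(payload: dict) -> list:
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    for field, allowed in ALLOWED_VALUES.items():
        value = payload.get(field)
        if value not in allowed:
            errors.append(f"{field}: unexpected value {value!r}")
    return errors


def idempotency_key(payload: dict) -> str:
    """Combine transcription_id and audio_version into one key.

    The ':' separator is an arbitrary choice for this sketch.
    """
    return f"{payload['transcription_id']}:{payload['audio_version']}"
```

A worker could record each key it has finished and acknowledge redelivered messages with a known key without reprocessing them.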

Consumer Outcomes

Event	Consumer outcome
`transcription.requested`	ML downloads audio, runs batch transcription/synthesis, writes results to Core REST, and updates status transitions.
`meeting.session.terminated`	ML drains AssemblyAI, flushes final live segments via Core REST, and closes its ML WebSocket connection.
