We keep embeddings alongside their source rows in Postgres via pgvector rather than running a standalone vector database.

0001 — Postgres with `pgvector` as the production vector store

Status: Accepted
Date: 2026-04-19
Deciders: core platform
Supersedes: none
Superseded by: none

Context

Wordloop generates and stores embeddings for transcript chunks, speaker utterances, and recap summaries. A retrieval-augmented generation (RAG) workflow at read time uses these embeddings to supply context to model calls.

The default instinct when adding a GenAI feature is to reach for a dedicated vector database — Pinecone, Milvus, Weaviate, or similar. These systems offer specialised ANN indexes, horizontal scale, and purpose-built tooling. At our current scale, they also introduce an operational surface we do not need and a split-brain failure mode we actively want to avoid.

Embeddings in Wordloop are not an island. They exist because a specific transcript chunk exists. They must appear atomically with the chunk, be removed atomically when the chunk is removed, and obey the same authorisation rules the chunk does. A system where the transcript lives in Postgres and its embedding lives in a separate service that is updated "eventually" is a system where queries will silently return embeddings for deleted content or miss content that was just created — neither of which is acceptable.

Decision

Use PostgreSQL with the pgvector extension as the single production vector store. Embeddings live on the row they describe (or in a sibling table joined by primary key), committed in the same transaction as their source data.

Consequences

Atomic writes. Inserting a transcript chunk and its embedding happens in one transaction. If the embedding fails to compute or save, the chunk rolls back. There is no asynchronous reconciliation process and no inconsistency window.
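
The single-transaction write can be sketched in Go with the standard `database/sql` package. This is a minimal sketch under stated assumptions, not Wordloop's actual code: the tables `transcript_chunks` and `chunk_embeddings`, their columns, and both function names are hypothetical. It also assumes a registered Postgres driver that sends the embedding as text so the `$2::vector` cast can parse it.

```go
package main

import (
	"context"
	"database/sql"
	"strconv"
	"strings"
)

// vectorLiteral renders an embedding in pgvector's text input form,
// for example "[0.1,0.2,0.3]".
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// insertChunkWithEmbedding writes a chunk and its embedding in one
// transaction, so either both rows become visible or neither does.
// Table and column names are hypothetical.
func insertChunkWithEmbedding(ctx context.Context, db *sql.DB,
	transcriptID int64, body string, embedding []float32) (int64, error) {

	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return 0, err
	}
	// After a successful Commit this Rollback does nothing useful and
	// simply returns sql.ErrTxDone, which is ignored here.
	defer tx.Rollback()

	var chunkID int64
	err = tx.QueryRowContext(ctx,
		`INSERT INTO transcript_chunks (transcript_id, body)
		 VALUES ($1, $2) RETURNING id`,
		transcriptID, body).Scan(&chunkID)
	if err != nil {
		return 0, err
	}

	_, err = tx.ExecContext(ctx,
		`INSERT INTO chunk_embeddings (chunk_id, embedding)
		 VALUES ($1, $2::vector)`,
		chunkID, vectorLiteral(embedding))
	if err != nil {
		return 0, err // the deferred Rollback discards the chunk row too
	}

	if err := tx.Commit(); err != nil {
		return 0, err
	}
	return chunkID, nil
}
```

In this sketch the embedding is computed before the transaction opens and passed in as a slice. That is one possible design choice: a failed or slow model call then never holds a database transaction open, and a failure at either insert leaves no chunk behind.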

One operational surface. The database we already run, already back up, already monitor, already manage migrations for, is also the vector store. No second system to provision, secure, or teach on-call about.

One authorisation model. The row-level security rules that protect transcript data also protect the embeddings. We do not have to re-implement access control in a second system and hope the two models agree.

Adequate performance at current scale. pgvector's IVFFlat and HNSW indexes are sufficient for our current and projected vector counts. We benchmark quarterly; we have not approached the scale where a purpose-built vector database would outperform pgvector by a margin that justifies the operational cost.

Alternatives considered

Pinecone, Milvus, Weaviate. Rejected for the split-brain failure mode and the second operational surface. Revisit if vector count per tenant exceeds ~10M and pgvector benchmarks degrade materially.
Embeddings in a denormalised column with in-Go cosine comparison. Rejected for O(n) query cost — acceptable for small datasets in prototypes, unacceptable in production.
Embeddings in an object store with a hand-rolled ANN index. Rejected for the cost of maintaining the index and the absence of transactional guarantees.
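
For contrast, the rejected in-Go cosine comparison might look roughly like the sketch below. It is illustrative only; the function names are hypothetical and do not come from the Wordloop codebase. Every query scores every stored vector, which is where the O(n) cost comes from.

```go
package main

import (
	"math"
	"sort"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|). It returns 0 when the
// lengths differ or either vector has zero norm.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the indices of the k stored vectors most similar to query.
// It scans all n vectors of dimension d, so each query costs O(n*d) for
// scoring plus O(n log n) for the sort.
func topK(query []float32, stored [][]float32, k int) []int {
	idx := make([]int, len(stored))
	sims := make([]float64, len(stored))
	for i, v := range stored {
		idx[i] = i
		sims[i] = cosineSimilarity(query, v)
	}
	// Highest similarity first; ties keep their original order.
	sort.SliceStable(idx, func(i, j int) bool {
		return sims[idx[i]] > sims[idx[j]]
	})
	if k < 0 {
		k = 0
	}
	if k > len(idx) {
		k = len(idx)
	}
	return idx[:k]
}
```

With three stored vectors `{1, 0}`, `{0, 1}`, and `{1, 1}` and the query `{1, 0}`, `topK(..., 2)` ranks index 0 first and index 2 second. This is fine for a prototype, but the scan grows linearly with every chunk added.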

Debt annotation

Principal: None beyond installing the pgvector extension, which is a single SQL statement (`CREATE EXTENSION vector;`) per environment once the extension package is available on the database server.

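Interest: Low. pgvector is actively maintained and widely deployed; index tuning (IVFFlat `lists`, HNSW `ef_construction`) is largely a one-time cost per table, although IVFFlat `lists` is worth revisiting if a table's row count grows substantially.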
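
As a rough illustration of what that tuning involves, the sketch below generates index DDL in Go. It is a sketch, not our migration code: the helper names and the `chunk_embeddings` / `embedding` identifiers in the usage note are hypothetical. The `lists` heuristic (rows / 1000 up to 1M rows, the square root of rows above that) is the starting point suggested in the pgvector README, and `m = 16`, `ef_construction = 64` are pgvector's documented HNSW defaults. The choice of cosine distance (`vector_cosine_ops`) is an assumption about how the embeddings are compared.

```go
package main

import (
	"fmt"
	"math"
)

// suggestLists applies the pgvector README's starting-point heuristic for
// IVFFlat: rows/1000 for up to 1M rows, sqrt(rows) above that.
func suggestLists(rows int) int {
	var n int
	if rows <= 1_000_000 {
		n = rows / 1000
	} else {
		n = int(math.Sqrt(float64(rows)))
	}
	if n < 1 {
		n = 1 // lists must be at least 1
	}
	return n
}

// ivfflatIndexSQL builds a CREATE INDEX statement for an IVFFlat index.
// table and column must be trusted constants (for example from a migration
// file), never user input, because they are interpolated directly.
func ivfflatIndexSQL(table, column string, rows int) string {
	return fmt.Sprintf(
		"CREATE INDEX ON %s USING ivfflat (%s vector_cosine_ops) WITH (lists = %d);",
		table, column, suggestLists(rows))
}

// hnswIndexSQL builds a CREATE INDEX statement for an HNSW index.
// The same warning about trusted identifiers applies.
func hnswIndexSQL(table, column string, m, efConstruction int) string {
	return fmt.Sprintf(
		"CREATE INDEX ON %s USING hnsw (%s vector_cosine_ops) WITH (m = %d, ef_construction = %d);",
		table, column, m, efConstruction)
}
```

For a table of 500,000 rows, `ivfflatIndexSQL("chunk_embeddings", "embedding", 500000)` yields a statement with `lists = 500`. The pgvector README also recommends building an IVFFlat index only after the table holds data; HNSW has no such requirement.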

Multiplier: Vector count per tenant. If a single tenant's embedding set grows to the point where pgvector's ANN indexes no longer hold up in our benchmarks and a purpose-built vector database would outperform them by a useful margin (empirically, in the tens of millions of vectors), revisit this decision. The migration path is well understood (dual-write, shadow-read, cut over) but non-trivial.

Verification

`SELECT extname FROM pg_extension WHERE extname = 'vector';` returns a row on every environment.
Transcript chunk insertion and embedding insertion commit in the same database transaction (for example, a freshly inserted chunk row and its embedding row carry the same `xmin`).
No application code writes to an external vector service.
