Data minimisation, GDPR, PII handling, and data residency for a platform that processes meeting audio.

Privacy

TL;DR

We handle private audio from users who trusted us with it. Our privacy stance is that we only collect what we need, keep it only as long as we need it, expose it only where it is needed, and let users see, correct, and remove their own data on demand. Privacy is a design input, not a compliance appendage.

Why this matters

A privacy failure at a company like ours is not a regulatory inconvenience — it is a direct breach of the most sensitive interaction a user has with our product. A recording of someone's meeting, a transcript of a difficult conversation, a recap that includes names and numbers — none of this has the same forgiveness curve as a leaked login. Privacy has to be thought about at design time, because once the data exists in the wrong shape or the wrong place, remediation is punishingly expensive.

Our principles

1. Collect the minimum

For every field we capture, we ask: do we actually need this to deliver the user's outcome? Email for authentication, audio for transcription, participation records for collaboration — yes. Browser fingerprint for "analytics" — almost never. Data minimisation reduces both privacy risk and operational complexity.

2. Retain for a bounded time

Every category of data has an explicit retention policy set at collection time. Audio is transcribed and then deleted unless the user has opted into retention; transcripts follow the meeting's retention policy; derived embeddings inherit the shortest retention among their sources. "We keep it forever" is never the answer. Expired data is deleted by scheduled, monitored automation, not by an ad hoc cleanup someone remembers to run on a Tuesday afternoon.
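A minimal sketch of how these rules could be encoded. The `Record` type, the category names, and the retention windows are all hypothetical and do not reflect our actual policy values. Each record gets its expiry when it is collected, and a derived artefact such as an embedding inherits the earliest expiry among its sources.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    category: str          # hypothetical category name, e.g. "transcript"
    collected_at: datetime
    expires_at: datetime   # fixed at collection time, never "forever"


def collect(category: str, now: datetime, retention: timedelta) -> Record:
    """Create a record whose expiry is decided at collection time."""
    return Record(category, now, now + retention)


def derive(category: str, now: datetime, sources: Iterable[Record]) -> Record:
    """A derived record inherits the shortest retention among its sources."""
    sources = list(sources)
    if not sources:
        raise ValueError("a derived record needs at least one source")
    return Record(category, now, min(s.expires_at for s in sources))


def expired(records: Iterable[Record], now: datetime) -> List[Record]:
    """Select the records an automated deletion job should remove."""
    return [r for r in records if r.expires_at <= now]
```

A scheduled job would call `expired` on each run and hand the result to the deletion pipeline; the point of the sketch is only that the expiry is data on the record, not a decision made later.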

3. Access is scoped and audited

Every internal access to user data is authenticated, authorised, and logged. Engineers cannot browse production data casually; support staff cannot read a transcript without a clear business reason and an auditable access record. Unsupervised access is a policy failure waiting to be discovered.
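One way to make "no access without a reason and a record" hard to bypass is to put the audit write in the same function as the read. The sketch below assumes a hypothetical `support` role, an in-memory audit log, and a dictionary standing in for the transcript store; a real system would write to an append-only audit sink.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set


class AccessDenied(Exception):
    """Raised when a read is refused; the refusal is still audited."""


@dataclass
class AuditLog:
    entries: List[dict] = field(default_factory=list)

    def record(self, actor: str, resource: str, reason: str, allowed: bool) -> None:
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "resource": resource,
            "reason": reason,
            "allowed": allowed,
        })


def read_transcript(actor: str, roles: Set[str], transcript_id: str,
                    reason: str, store: Dict[str, str], audit: AuditLog) -> str:
    """Return a transcript only for an authorised actor with a stated reason."""
    allowed = "support" in roles and bool(reason.strip())
    # Write the audit entry before returning data, and for denials too.
    audit.record(actor, f"transcript:{transcript_id}", reason, allowed)
    if not allowed:
        raise AccessDenied(f"{actor} may not read transcript {transcript_id}")
    return store[transcript_id]
```

Because the denial is logged as well, a pattern of refused attempts is visible in the same place as legitimate access.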

4. Users see, control, and remove their data

Data subject rights — access, rectification, portability, deletion — are first-class features, not regulatory bolt-ons. A user's deletion request flows through the same plumbing as retention expiry: structured, automated, and verifiable. A deletion that leaves "just this one copy" around is a promise broken.
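The "same plumbing" idea can be sketched as a single purge routine that both the retention job and the deletion-request handler call. Everything here is illustrative: the `Store` class is an in-memory stand-in, and the store names are hypothetical. The routine deletes from every registered store and then verifies that no copy survives, so a forgotten replica fails loudly instead of silently.

```python
from typing import Dict, Iterable, List, Set


class Store:
    """In-memory stand-in for one place user data can live."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, str] = {}  # record_id -> owner_id

    def ids_for_owner(self, owner_id: str) -> Set[str]:
        return {rid for rid, owner in self.rows.items() if owner == owner_id}

    def delete(self, record_ids: Iterable[str]) -> None:
        for rid in record_ids:
            self.rows.pop(rid, None)

    def holds_any(self, record_ids: Iterable[str]) -> bool:
        return any(rid in self.rows for rid in record_ids)


def purge(record_ids: Iterable[str], stores: List[Store]) -> None:
    """Delete the records everywhere, then verify nothing is left behind."""
    ids = set(record_ids)
    for store in stores:
        store.delete(ids)
    leftovers = [s.name for s in stores if s.holds_any(ids)]
    if leftovers:
        raise RuntimeError(f"purge incomplete in: {', '.join(leftovers)}")


def handle_deletion_request(owner_id: str, stores: List[Store]) -> Set[str]:
    """A user's deletion request resolves to record ids, then reuses purge."""
    ids: Set[str] = set()
    for store in stores:
        ids |= store.ids_for_owner(owner_id)
    purge(ids, stores)
    return ids
```

A retention-expiry job would call the same `purge` with ids selected by expiry rather than by owner.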

5. Design for data residency

Where data lives matters — both for regulation (EU user data must stay on EU infrastructure for some purposes) and for user expectation. Residency is a design input to storage and pipeline choices, not an afterthought discovered during procurement.
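As an illustration of residency as a design input, storage selection can be made a function of the user's region that fails closed. The region codes and bucket names below are assumptions made up for this sketch, not our actual infrastructure.

```python
from typing import Dict

# Hypothetical mapping from a user's residency region to a storage location.
REGION_BUCKETS: Dict[str, str] = {
    "eu": "audio-eu-west",
    "us": "audio-us-east",
}


def bucket_for(region: str) -> str:
    """Resolve the storage bucket for a region; never fall back to a default."""
    key = region.strip().lower()
    if key not in REGION_BUCKETS:
        # Failing closed: an unknown region is an error, not "store it anywhere".
        raise ValueError(f"no storage configured for region {region!r}")
    return REGION_BUCKETS[key]
```

The design choice worth noting is the absence of a default bucket: a silent fallback is how data ends up in the wrong place.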

6. PII is handled distinctly from content

Email addresses, names, IP addresses: PII gets shorter retention and tighter access controls, and where we can help it, it is not co-located with content. Treating all data the same turns the problems of the most sensitive fields into the problems of every field.

7. Model training respects user choice

User data is used to train or evaluate models only when the user has given informed consent, and the consent record is auditable. Assuming consent because "everyone does" is not a posture we hold.
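A sketch of what an auditable consent gate could look like, assuming a hypothetical `Consent` record with a purpose, a grant time, and an optional revocation time, and samples represented as dictionaries with a `user_id` key. Samples are included only when a matching consent was in force at the time the training set is built.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Consent:
    user_id: str
    purpose: str                       # e.g. "model_training" (hypothetical name)
    granted_at: datetime
    revoked_at: Optional[datetime] = None


def has_consent(consents: Iterable[Consent], user_id: str,
                purpose: str, at: datetime) -> bool:
    """True only if an explicit, unrevoked consent covers this purpose at `at`."""
    for c in consents:
        if c.user_id != user_id or c.purpose != purpose:
            continue
        if c.granted_at <= at and (c.revoked_at is None or at < c.revoked_at):
            return True
    return False


def training_set(samples: Iterable[dict], consents: List[Consent],
                 at: datetime) -> List[dict]:
    """Keep only samples whose owner consented; absence of a record means no."""
    return [s for s in samples
            if has_consent(consents, s["user_id"], "model_training", at)]
```

The default is exclusion: a user with no consent record, or with a revoked one, never reaches the training set.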

8. Privacy reviews happen before launch

Every feature that touches user data has a privacy review before it ships — the same rhythm as a security review, often in the same meeting. The reviewer asks the specific questions a regulator or an investigative journalist would, and the answers go on the record. "We will do the privacy review after launch" is a commitment that never gets honoured.

How we apply this

Data Engineering — retention and contract discipline.
Security — the perimeter that privacy relies on.
Postgres — retention enforced at the storage layer.
Operations — the incident response for a privacy event.

Anti-patterns we reject

"Privacy is the lawyers' job." By the time the lawyers are involved, the damage is done. Privacy is an engineering discipline.
Retention that defaults to forever. Tables that grow and are never cleaned are privacy incidents waiting to happen.
Development data scraped from production. A dev environment with a sample of real user transcripts is a breach waiting to be noticed.
Analytics as a free pass. "It is for analytics" is not a sufficient justification for collecting a piece of PII. The same bar applies.
PII in logs. Trace and log data routinely outlives the systems that produced it. PII does not belong there.
Consent-by-omission. A pre-ticked "we may use your data to improve the model" box buried in a ToS is not consent.
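For the "PII in logs" anti-pattern, a last line of defence can be a filter built on Python's standard `logging` module that redacts email-like strings before a record is emitted. This is a sketch under clear limits: the regular expression and the `[redacted-email]` placeholder are our own choices for illustration, the pattern only catches email-shaped text, and names or IP addresses need separate handling. The better fix remains not logging PII in the first place.

```python
import logging
import re

# Deliberately simple pattern for email-shaped strings; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmails(logging.Filter):
    """Replace email-like substrings in the final log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, so PII passed as
        # a format argument is redacted too, then drop the raw arguments.
        record.msg = EMAIL_RE.sub("[redacted-email]", record.getMessage())
        record.args = ()
        return True  # keep the record, just with the message rewritten
```

The filter can be attached with `handler.addFilter(RedactEmails())` on each handler that ships logs off the machine.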
