Build document · maximum detail · strict build order · dev-led · 4 reader angles

n-lawOS — the build, end to end

A working engineering document: every step in build order, with the actual data model, contracts, commands, config, failure modes and the reasoning behind each choice. Two columns at a time — click anywhere in either column to switch the whole document between Lay | Dev and Customer | Compliance.

From locked decisions D1–D12 · 13 Jun 2026 · solo/bootstrapped · code/schemas are reference sketches · links canonical (confirm before relying) · not legal advice


§0 · How to read — one doc, four angles

👤 Lay

Two columns. Left = plain words, right = the technical build. Click anywhere in either column (or the bar that follows you down) and both columns switch to a different pair of readers.

⚙ Dev

Default = Lay | Dev; click → Customer | Compliance. Dev is the spine; this is a build-from-it doc — schemas, contracts, commands, env, edge cases, rationale. Order is P0→P9 (real build order). §A data model and §B repo layout are referenced throughout; §C env vars, §D services index at the end.

🙋 Customer (the firm)

What the buying firm gets and what changes for them at this step — value, control, what stays in their hands.

⚖ Compliance

The data-protection / regulatory consequence of this step — lawful basis, where data sits, what a regulator or auditor would check. Grounded in D3/D4. Not legal advice.

Print/PDF shows all four angles at once. Selecting text won't trigger a switch — only a plain click does.

§A · Core data model (the one Postgres spine)

👤 Lay

Everything the firm has — clients, calls, notes, time, the record of who did what — lives in one organised filing system, with strict rules on who can open which drawer.

⚙ Dev

One Postgres database. Row-Level Security on every table from migration #1. pgvector for semantic search. Drizzle schema + SQL migrations in infra/migrations. Core tables:

-- the spine (abridged; all have created_at, updated_at, RLS)
tenant(id, name, region)                                           -- 1 row in single-tenant; many at Layer 2
identity(id, email, role, idp_sub)                                 -- role: admin|feeearner|reviewer|viewer
matter(id, ref, client_name, status, owner_id)
party(id, matter_id, kind, display_name, embedding vector(1536))   -- conflicts/search
recording(id, matter_id, source, uri, started_at, duration_s, retention_until)
transcript(id, recording_id, text, lang, engine)
redaction_token(id, scope_id, token, real_value)                   -- THE map · never leaves tenant · never in audit/telemetry
job_run(id, matter_id, kind, tier, model, status, masked_in, json_out, units, tokens_in, tokens_out, edit_distance, created_by)
note(id, matter_id, job_run_id, body, state, signed_by, signed_at) -- state: draft|review|signed
time_entry(id, matter_id, note_id, actor, actor_tier, units_6min, billable, signed)
audit_log(id, actor_type, actor_id, action, object, before, after, tool, ts)   -- APPEND-ONLY
memory(id, scope, category, body, embedding vector(1536), provenance)
usage_event(id, tenant_id, job_kind, tier, tokens, units, ts)      -- numbers only → billing

Invariants baked into the schema: (1) redaction_token is readable only by the tenant role, never selected into audit_log/usage_event; (2) audit_log has no UPDATE/DELETE grant (append-only — enforced by a trigger + revoked privileges); (3) job_run.masked_in only ever holds masked text (CI check, P1.4).

🙋 Customer

One trustworthy record of the whole firm. Staff see only what their role allows. The "real names" key-sheet is a separate locked drawer that never leaves your cloud, and the activity log can never be quietly edited.

⚖ Compliance

Access control at the database (RLS), not just the UI — defensible least-privilege. The re-identification key (redaction_token) is isolated and tenant-only, supporting the "de-identified in the model's hands" position (D3). Append-only audit_log = tamper-evidence. One store = one place to evidence retention, SARs and erasure.

-- RLS: a fee-earner sees only matters they're on
-- (matter_member(matter_id, user_id) is a join table not listed in the abridged spine)
CREATE POLICY matter_read ON matter FOR SELECT USING (
  owner_id = current_setting('app.user_id')::uuid
  OR EXISTS (
    SELECT 1 FROM matter_member m
    WHERE m.matter_id = matter.id
      AND m.user_id = current_setting('app.user_id')::uuid
  )
);

§B · Repository & deploy layout

👤 Lay

The whole system lives in one project folder so it's built, tested and shipped as a single installable unit. The secret "recipe" (the AI prompts) lives in a separate, private folder that never ships.

⚙ Dev

# monorepo (Turborepo + pnpm): ships/escrows as one unit
n-lawos/
├─ apps/web/                 # Next.js (App Router) + TS: the OS shell
│  ├─ app/(app)/{home,matters,calls,inbox,tasks,time,knowledge,audit,billing}/
│  ├─ lib/{db,sync,auth,mcp-client,relay-client,redactor-client}
│  └─ components/            (shadcn/ui)
├─ services/redactor/        # Python FastAPI · Presidio + spaCy + UK recognizers
├─ services/relay/           # LiteLLM config + usage callback (holds the key)
├─ services/transcribe/      # faster-whisper worker
├─ packages/db/              # Drizzle schema + migrations + RLS policies
├─ packages/mcp/             # one MCP server per surface (tool schemas)
├─ packages/jobs/            # pg-boss runner + the job catalogue
├─ packages/shared/          # zod contracts shared by web + services
├─ infra/compose/            # docker-compose.yml + per-service Dockerfiles
├─ infra/migrations/         # SQL
└─ .github/workflows/        # CI: semgrep, gitleaks, jest, playwright, invariant-check

n-law-engine/                # SEPARATE PRIVATE REPO: prompts + orchestration (the IP)
                             # never in any handover, escrow or customer image

Why a monorepo: one version, one CI, one deployable; the egress-gate invariant can be checked across the whole tree. Why the Engine is separate: the shipped/escrowed image must contain no IP and no back door (D8).

🙋 Customer

You receive one installable system that runs in your cloud. The vendor's "secret sauce" isn't in it and can't be — what you run is auditable, what makes it clever is licensed.

⚖ Compliance

Single deployable = one auditable boundary for a DPIA / escrow review. Engine-repo separation means the customer image is inspectable and contains no hidden data path. CI lives with the code so controls (redact-before-model, secret-scan) are evidenced on every change.

Foundations & repository

monorepo→ shell→ DB+RLS→ sync→ auth→ deploy bundle

Done when — docker compose up brings the stack live on one VM; a seeded user can sign in via the firm's SSO; an empty matter can be created and is visible only to its owner (RLS proven with a second user).

P0.1

Create the monorepo

👤 Lay

Set up the single project that holds the whole system, built and shipped together.

⚙ Dev

pnpm dlx create-turbo@latest n-lawos
cd n-lawos && pnpm add -w -D typescript prettier @biomejs/biome

# pnpm-workspace.yaml
packages: [ "apps/*", "services/*", "packages/*" ]

Strict TS everywhere ("strict": true, noUncheckedIndexedAccess). Shared contracts as zod schemas in packages/shared so the web app and the Python services agree on shapes (generate JSON Schema for the Python side). Engine prompts: git init a second private repo.

🙋 Customer

Nothing visible yet — but it means one clean install later, not a pile of parts to wire up.

⚖ Compliance

One repo = one auditable artefact. Prompts/IP kept out of it from commit one, so escrow/handover never leaks the Engine or a back door.

stack: Turborepo · pnpm · TypeScript

P0.2

App shell & routes

👤 Lay

The window everyone works in — the screens, menus and buttons.

⚙ Dev

pnpm dlx create-next-app apps/web --ts --app --tailwind --eslint
cd apps/web && pnpm dlx shadcn@latest init   # run from the app directory

App Router; one route group (app) with a route per surface (see §B tree). Tenant data is client-rendered against the local sync store (P0.4), not SSR'd off-instance, so nothing renders server-side outside the tenant. A thin server layer only sets the RLS request context (P0.5) and proxies to the redactor/relay/MCP. Layout = left surface nav + top bar + content (the windows you saw in v2).

🙋 Customer

One fast app your team logs into, branded to your firm.

⚖ Compliance

No tenant data leaves the instance to render; the UI executes inside the firm's deployment. Server endpoints are thin and logged.

stack: Next.js · Tailwind · shadcn/ui

P0.3

Database, RLS & migrations

👤 Lay

The single filing system, with locked drawers per role.

⚙ Dev

Postgres 16 + CREATE EXTENSION vector. Drizzle schema in packages/db; migrations in infra/migrations. RLS enabled on every table in the first migration — never "add security later". Each request opens a transaction and sets the actor:

SET LOCAL app.user_id = '...';
SET LOCAL app.role = 'feeearner';
-- then all queries run under that identity; RLS does the filtering

Tables per §A. audit_log: revoke UPDATE/DELETE; add a trigger raising on modify. Edge case: background jobs (no human) run as a service role with its own narrow policies.
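SET LOCAL does not accept bind parameters, so application code usually reaches for set_config(name, value, true), the function form with the same transaction-local scope. A minimal sketch of how the per-request context could be built, assuming a driver that takes text plus values; the names SqlStatement and rlsContextStatements are hypothetical and no database is touched here:

```typescript
// Hypothetical helper: builds the statements that open a request-scoped RLS context.
// set_config(name, value, true) is transaction-local, like SET LOCAL, and takes parameters.
type AppRole = "admin" | "feeearner" | "reviewer" | "viewer";

interface SqlStatement {
  text: string;
  values: string[];
}

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function rlsContextStatements(userId: string, role: AppRole): SqlStatement[] {
  // Reject anything that is not a UUID before it gets near the database.
  if (!UUID_RE.test(userId)) {
    throw new Error("app.user_id must be a UUID");
  }
  return [
    { text: "BEGIN", values: [] },
    { text: "SELECT set_config('app.user_id', $1, true)", values: [userId] },
    { text: "SELECT set_config('app.role', $1, true)", values: [role] },
  ];
}
```

The caller would run these on one connection, then the request's queries, then COMMIT, so the settings vanish with the transaction and cannot leak to the next request on a pooled connection.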

🙋 Customer

One trustworthy record; staff see only what their role allows; the activity log can't be secretly changed.

⚖ Compliance

Least-privilege enforced at the data layer; append-only audit; single store simplifies retention + SAR + erasure evidence. DPIA references the RLS policy set and the audit trigger.

stack: PostgreSQL · pgvector · Drizzle

P0.4

Local-first sync

👤 Lay

The app keeps working offline and updates live for everyone.

⚙ Dev

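ElectricSQL syncs Postgres into a local store in the browser (realtime, readable offline). Define "shapes" (filtered subscriptions) per surface, e.g. matters where owner or member = me. Current Electric releases sync the read path only and do not evaluate Postgres RLS on shapes, so shape requests pass through the thin server layer, which pins each shape's filter to the signed-in user; confirm both points against the pinned Electric version. Writes queue in a local outbox and replay through the app's own write endpoint (under RLS) on reconnect; conflicts surface to a resolve UI (keep-mine / use-theirs / merge). Replaces Firebase/Firestore, which is Google-hosted, can't run in-tenant and is forbidden.

shape: matter, party, note, time_entry (filter mirrors the RLS rule) → live to client; offline writes → outbox → replay via write API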

🙋 Customer

No lost work — it queues offline and syncs when you reconnect; teammates' changes appear live.

⚖ Compliance

Sync stays inside the tenant boundary; no third-party (e.g. Google) processor in the data path — a clean line in the DPIA.

stack: ElectricSQL

P0.5

Identity, SSO & the RLS bridge

👤 Lay

Staff sign in with their normal work login.

⚙ Dev

Auth.js with an OIDC provider = the firm's Microsoft Entra (or Google; Keycloak self-hosted for big firms). On each request, map the session → identity.role → SET LOCAL app.user_id/app.role (the RLS bridge from P0.3). No Firebase auth. Edge cases: leaver = IdP revokes → session dies; role change propagates on next token refresh.

🙋 Customer

Single sign-on with your existing Microsoft/Google accounts — no new passwords; leavers lose access through your normal process.

⚖ Compliance

Identity stays with the firm's IdP; access tied to the firm's joiner/leaver controls. SSO/SAML expected at enterprise tier; every auth event in the audit log.

stack: Auth.js · Keycloak · MS Entra

P0.6

Deploy bundle (in the firm's cloud)

👤 Lay

Package the system so it installs and runs the same inside the firm's own cloud.

⚙ Dev

# infra/compose/docker-compose.yml (services)
postgres     # the spine + pgvector
web          # Next.js
redactor     # Python FastAPI (Presidio)
relay        # LiteLLM (holds the key)
transcribe   # faster-whisper worker
livekit      # calls + recording
worker       # pg-boss job runner
electric     # sync

One .env per tenant (§C). Healthchecks + restart policies; nightly pg_dump → encrypted to R2. Single VM (e.g. Hetzner CCX) to start; Kubernetes only at scale. This single-tenant in-tenant deploy is the control behind D6.

🙋 Customer

It runs on your infrastructure — we never host your data; you hold the keys.

⚖ Compliance

Single-tenant, in-tenant deployment is the core D6 control: data never leaves the firm's cloud; vendor is not a host or a joint controller of content.

stack: Docker Compose · R2 (backups)

The egress gate (redactor) — before any model call

detection sidecar→ UK recognizers→ token map + deny-by-default→ CI invariant

Done when — POST /redact returns masked text + a tenant-only token map; low-confidence spans queue for human confirm; a measured recall figure exists on a representative transcript set; CI fails any model call that bypasses it.

P1.1

Detection sidecar & contract

👤 Lay

The part that finds every personal detail in the text before anything is sent on.

⚙ Dev

Python FastAPI sidecar wrapping Presidio AnalyzerEngine (spaCy en_core_web_lg NER) + custom recognizers (P1.2), ensembled. Stateless HTTP; the only caller is the web server, never the model.

POST /redact
# request
{ "text": "...", "tenant": "...", "minConf": 0.4 }
# response
{
  "maskedText": "...spoke to [PERSON_1] at [ADDRESS_1]...",
  "tokenMap": { "[PERSON_1]": "Ms A. Khan", "[ADDRESS_1]": "14 Elm Court" },
  "spans": [{type,start,end,score}],
  "lowConfidence": [{...}]
}

Recall > precision (β=2). Treat vanilla Presidio as a baseline — it misses PII; tuning + custom recognizers lift it materially. Evaluate with presidio-research on representative transcripts; record recall per type.
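"Recall > precision (β=2)" refers to the F-beta score, where a beta above 1 weights recall more heavily than precision. A small sketch of the arithmetic, handy for sanity-checking the figures an eval run reports; the function names are illustrative and the counts in the usage note are made up:

```typescript
// F-beta: beta > 1 weights recall above precision (beta = 2 matches the text).
function fBeta(precision: number, recall: number, beta: number = 2): number {
  const b2 = beta * beta;
  const denom = b2 * precision + recall;
  return denom === 0 ? 0 : ((1 + b2) * precision * recall) / denom;
}

// Convenience wrapper over raw counts: true positives, false positives, false negatives.
function fBetaFromCounts(tp: number, fp: number, fn: number, beta: number = 2): number {
  const precision = tp + fp === 0 ? 0 : tp / (tp + fp);
  const recall = tp + fn === 0 ? 0 : tp / (tp + fn);
  return fBeta(precision, recall, beta);
}
```

With beta = 2, a recognizer at precision 0.5 and recall 1.0 scores higher than one at precision 1.0 and recall 0.5, which is the trade the egress gate wants: an over-masked word costs a review click, a missed name is a leak.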

🙋 Customer

Built to catch names, numbers and IDs before anything leaves — and to ask a human when it's unsure.

⚖ Compliance

Detection quality is measured, not assumed (Q2). Target anonymisation under the ICO "motivated intruder" test, not mere pseudonymisation (D3). The measured residual rate is the evidence the risk is managed; marketing must say "designed to minimise re-identification risk", never "all PII removed".

stack: Presidio · spaCy · eval kit

P1.2

Custom UK recognizers

👤 Lay

Extra catchers for UK things: National Insurance, NHS numbers, postcodes, sort codes, case references.

⚙ Dev

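# NHS number: 10 digits, mod-11 checksum
from presidio_analyzer import Pattern, PatternRecognizer

class NhsNumberRecognizer(PatternRecognizer):
    def __init__(self):
        super().__init__(supported_entity="NHS_NUMBER",
                         patterns=[Pattern("nhs", r"\b\d{3}\s?\d{3}\s?\d{4}\b", 0.4)])

    def validate_result(self, pattern_text):  # checksum hook: pass promotes the score, fail drops the match
        return nhs_mod11(pattern_text)

# also: NINO, UK postcode, sort code (nn-nn-nn), case/claim refs, UK mobile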

Validators (checksums/format) raise confidence so true hits clear threshold and noise stays low. Store recognizer configs in services/redactor/recognizers/; version them; re-run eval on every change.
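The mod-11 check itself is short. The redactor service is Python, so the sketch below is only an illustration of the checksum logic in TypeScript (for instance for masking structured fields by rule on the web side); nhsMod11 is a hypothetical name, not an existing helper:

```typescript
// NHS number check: weight the first nine digits 10..2, sum, take mod 11,
// and compare 11 - remainder (with 11 mapped to 0) against the tenth digit.
function nhsMod11(candidate: string): boolean {
  const digits = candidate.replace(/\s/g, "");
  if (!/^\d{10}$/.test(digits)) return false;
  let sum = 0;
  for (let i = 0; i < 9; i++) {
    sum += Number(digits.charAt(i)) * (10 - i);
  }
  const remainder = sum % 11;
  const check = remainder === 0 ? 0 : 11 - remainder;
  if (check === 10) return false; // a check value of 10 never yields a valid number
  return check === Number(digits.charAt(9));
}
```

A ten-digit string that merely looks like an NHS number fails the checksum most of the time, which is why a validator can lift true hits over the threshold without dragging noise along.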

🙋 Customer

Tuned for UK legal data, not a generic filter.

⚖ Compliance

Per-recognizer recall figures are the audit evidence for residual re-identification risk; checksum validation reduces both misses and false flags.

stack: custom recognizers

P1.3

Token map (tenant-only) · deny-by-default · human review

👤 Lay

Each detail becomes a placeholder; the "placeholder = real value" sheet never leaves the firm. Anything uncertain goes to a person before sending.

⚙ Dev

Persist the map in redaction_token (scope = job or matter), readable only by the tenant role; never selected into audit_log/usage_event. Deny-by-default: spans with score < threshold are masked and queued (lowConfidence) for a human to confirm/relabel before the relay call proceeds. Structured fields (DOB, NINO columns) masked by rule, not just ML. Re-hydration (P3.2) reads this table firm-side only.
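The deny-by-default rule can be sketched as: every detected span is masked regardless of score, and the low scorers are additionally queued for review. The types and function names below are assumptions for illustration; the real masking happens in the Python sidecar, and the [TYPE_n] token shape follows the /redact example:

```typescript
interface DetectedSpan { type: string; start: number; end: number; score: number; }

interface RedactionResult {
  maskedText: string;
  tokenMap: Record<string, string>; // stays in redaction_token, tenant-side only
  lowConfidence: DetectedSpan[];    // queued for a human before the relay call
}

// Merge overlapping spans so no fragment slips out between two detections.
// The merged span keeps the lowest score, so doubt anywhere triggers review.
function mergeOverlaps(spans: DetectedSpan[]): DetectedSpan[] {
  const ordered = [...spans].sort((a, b) => a.start - b.start);
  const merged: DetectedSpan[] = [];
  for (const span of ordered) {
    const last = merged[merged.length - 1];
    if (last !== undefined && span.start < last.end) {
      last.end = Math.max(last.end, span.end);
      last.score = Math.min(last.score, span.score);
    } else {
      merged.push({ ...span });
    }
  }
  return merged;
}

function maskSpans(text: string, spans: DetectedSpan[], threshold: number): RedactionResult {
  const counters = new Map<string, number>();
  const tokenMap: Record<string, string> = {};
  const lowConfidence: DetectedSpan[] = [];
  let maskedText = "";
  let cursor = 0;
  for (const span of mergeOverlaps(spans)) {
    const n = (counters.get(span.type) ?? 0) + 1;
    counters.set(span.type, n);
    const token = `[${span.type}_${n}]`;
    tokenMap[token] = text.slice(span.start, span.end);
    maskedText += text.slice(cursor, span.start) + token; // masked whatever the score
    cursor = span.end;
    if (span.score < threshold) lowConfidence.push(span);
  }
  maskedText += text.slice(cursor);
  return { maskedText, tokenMap, lowConfidence };
}
```

The point of the design is the direction of failure: an uncertain span is hidden first and questioned second, so a reviewer can only ever un-mask, never discover that something already left.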

🙋 Customer

Real identities never leave your walls; a person checks anything uncertain before anything is sent.

⚖ Compliance

The token map is the re-identification key — isolating it in-tenant is what lets the outbound text be treated as de-identified (D3). Human-in-loop on low confidence is a documented, evidenced control.

stack: Presidio anonymizer

P1.4

CI invariant — no model call without masked text

👤 Lay

A built-in check that makes it impossible to wire the AI to raw text by mistake.

⚙ Dev

# .github/workflows + semgrep rule (abridged)
rules:
  - id: no-raw-text-to-relay
    languages: [typescript]
    patterns:
      - pattern: relay.complete($X, ...)
      - pattern-not: relay.complete(maskedText, ...)
    message: "model call must use maskedText (D3/P1)"
    severity: ERROR

Plus Gitleaks (secrets) and a runtime assertion in the relay client that rejects input not tagged {masked:true}. Build fails otherwise. This is the machine form of "redactor-before-engine".
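The runtime half of the invariant could look like the sketch below: a tagged wrapper that only the redactor client produces, and an assertion at the relay boundary. MaskedText, fromRedactor, assertMasked and relayComplete are hypothetical names, and relayComplete is a stand-in that makes no network call:

```typescript
// Hypothetical tag: the relay accepts only values carrying {masked:true}.
type MaskedText = { readonly masked: true; readonly text: string };

// Intended to be called by the redactor client alone, on the /redact response.
function fromRedactor(maskedText: string): MaskedText {
  return Object.freeze({ masked: true as const, text: maskedText });
}

function assertMasked(input: unknown): asserts input is MaskedText {
  const candidate = input as { masked?: unknown; text?: unknown } | null;
  if (
    typeof candidate !== "object" ||
    candidate === null ||
    candidate.masked !== true ||
    typeof candidate.text !== "string"
  ) {
    throw new Error("relay refused input: not tagged {masked:true} (D3/P1)");
  }
}

// Stand-in for the relay client; a real one would POST to the in-tenant relay.
function relayComplete(input: unknown): string {
  assertMasked(input);
  return `accepted ${input.text.length} masked chars`;
}
```

The static rule catches the mistake at review time and the assertion catches whatever the pattern misses (aliases, dynamic calls), so the two checks cover each other's blind spots.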

🙋 Customer

The "strip first" rule can't be skipped — the build itself enforces it.

⚖ Compliance

Redactor-before-engine is a code-enforced invariant, evidenced on every commit — a strong control narrative for a DPIA/audit.

stack: Semgrep · Gitleaks

The relay + model (AI substrate)

relay (key+meter)→ ZDR model + tiers→ MCP + swap layer

Done when — a masked string round-trips through the relay to a ZDR model and back; usage is recorded as counts only; a tier→model map is configurable; provider is swappable via one config change; nothing reaches a model except through redaction→relay.

P2.1

The relay — key holder + meter (in-tenant)

👤 Lay

A small piece of our software, inside the firm, that talks to the AI on the firm's behalf and counts what's used.

⚙ Dev

# services/relay: LiteLLM config.yaml (abridged)
model_list:
  - model_name: tier-junior          # cheap/fast
    litellm_params: { model: azure/gpt-4o-mini, api_base: os.environ/AZURE_UK_ENDPOINT }
  - model_name: tier-paralegal
    litellm_params: { model: azure/gpt-4o, api_base: os.environ/AZURE_UK_ENDPOINT }
  - model_name: tier-associate       # frontier
    litellm_params: { model: anthropic/claude-... , api_base: os.environ/ANTHROPIC_ZDR_ENDPOINT }
litellm_settings:
  success_callback: ["custom_usage_logger"]      # writes usage_event (counts only); sketch name for a custom callback
general_settings:
  master_key: os.environ/RELAY_MASTER_KEY        # n-law-issued, short-lived, rotated
  max_budget_per_tenant: os.environ/TENANT_CAP   # illustrative key: confirm the budget setting LiteLLM exposes

LiteLLM proxy runs in-tenant; holds an n-law-issued, scoped, short-lived key (rotated via heartbeat); per-tenant spend caps. Customers can't bring their own key; the key is never loose in app code. The success callback writes usage_event (tokens, tier, job kind — no content).

🙋 Customer

You never handle AI keys; usage is metered transparently and billed to you; a runaway can't blow the budget (per-tenant cap).

⚖ Compliance

Only de-identified text + counts leave; the relay is in-tenant. Numbers-only telemetry is both the PII-free-telemetry control and the billing meter. Key rotation + caps limit blast radius if a key leaks.

stack: LiteLLM

P2.2

The model — zero-retention, UK/EU, tiered

👤 Lay

The actual AI brain — chosen so it keeps nothing and sits in the UK/EU; a cheaper one for simple jobs, a top one for hard drafting.

⚙ Dev

Live: Azure OpenAI UK South (DataZone EU, ZDR on approved access) for junior/paralegal; Anthropic ZDR for associate. Dev/synthetic only: OpenRouter (never live). Tier is chosen by the job (P3). Confirm ZDR eligibility + region pinning on the actual accounts before go-live.

tier-junior → gpt-4o-mini · tier-paralegal → gpt-4o · tier-associate → frontier (Claude). All ZDR, UK/EU.

🙋 Customer

Your (already de-identified) text goes to a named provider under a no-keep, no-train contract in the UK/EU — not a black box.

⚖ Compliance

Gateway treated as a processor under a DPA: zero-retention, no-training, UK/EU residency, no-re-identification clause (D3). OpenRouter never on the live path. Provider swappable as law/guidance moves.

stack: Azure OpenAI · Anthropic · OpenRouter (dev)

P2.3

MCP tool layer + model-swap

👤 Lay

The wiring that lets the AI use each part of the app, and lets us change AI provider without rebuilding.

⚙ Dev

Each surface exposes an MCP server (packages/mcp) with typed tools, e.g. matter.read, note.draft, time.log, conflict.check. Subagents act only through MCP — never raw DB. A thin askModel(tier, messages) interface makes the provider a one-line config switch.

tool note.draft { input:{matterId, maskedTranscript}, output: NoteDraft }
askModel(tier:'paralegal', messages) → via redactor-checked relay only
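One way the swap layer could be shaped is sketched below: callers name a tier, the tier maps to a relay alias, and the vendor behind each alias lives in the relay config. The Provider type and makeAskModel factory are assumptions for illustration; the aliases mirror the tier names used in the relay config:

```typescript
type Tier = "junior" | "paralegal" | "associate";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string; // masked text only, by the P1.4 invariant
}

// A provider is anything that can answer for a relay alias; in production this
// would be one HTTP client pointed at the in-tenant relay.
type Provider = (modelAlias: string, messages: ChatMessage[]) => Promise<string>;

// Callers never see vendor model names, only tiers.
const TIER_ALIAS: Record<Tier, string> = {
  junior: "tier-junior",
  paralegal: "tier-paralegal",
  associate: "tier-associate",
};

function makeAskModel(provider: Provider) {
  return async function askModel(tier: Tier, messages: ChatMessage[]): Promise<string> {
    return provider(TIER_ALIAS[tier], messages);
  };
}
```

Because jobs depend on askModel and a tier name, changing which vendor serves tier-paralegal is an edit to the relay's model list and touches no job code.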

🙋 Customer

The AI helps across the whole app, and you're never locked to one AI vendor.

⚖ Compliance

Two hard chokepoints — MCP for tools, redaction→relay for models — make the data flow auditable and the provider replaceable.

stack: MCP · MCP repos

The first job — attendance note (walking skeleton closes)

job runner + contract→ note-drafter→ re-hydrate + sign + edit-rate + audit

Done when — a synthetic transcript → masked → drafted note + draft time entry → human signs → filed to a matter with an append-only audit row and a usage event; edit-distance recorded. The full loop works end-to-end on fake data.

P3.1

Job runner + the job contract

👤 Lay

The machinery that runs one defined AI task and produces one finished thing.

⚙ Dev

// packages/jobs: pg-boss; a Job is a bounded contract
import type { ZodTypeAny } from 'zod'

type Job = {
  kind: 'attendance_note' | 'summarise' | 'extract_time'   // | ... rest of the catalogue
  tier: 'junior' | 'paralegal' | 'associate'
  input: ZodTypeAny            // masked text + matter ctx
  output: ZodTypeAny           // structured deliverable
  minutesEquivalent: number    // the billed time-equivalent
}

// attendance_note output (shape only)
// NoteDraft = { heading, body, attendees[], timeUnits, citations[] }

Worker pulls from pg-boss → calls askModel via the relay on maskedText → validates output against the zod schema (retry on mismatch) → writes job_run + a draft note/time_entry (placeholders still in). Bounded jobs (not chat) keep the AI inside back-office support.
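The validate-then-retry step can be sketched without any library. The field types below are assumptions (the source gives only the field names), the hand-written guard stands in for the zod schema, and callModel is a synchronous stand-in for the relay call so the example stays self-contained:

```typescript
// Assumed field types for illustration; the source lists names only.
interface NoteDraftShape {
  heading: string;
  body: string;
  attendees: string[];
  timeUnits: number;
  citations: string[];
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Returns null on any mismatch instead of throwing, so the caller can retry.
function parseNoteDraft(raw: string): NoteDraftShape | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const heading = d["heading"];
  const body = d["body"];
  const attendees = d["attendees"];
  const timeUnits = d["timeUnits"];
  const citations = d["citations"];
  if (typeof heading !== "string" || typeof body !== "string") return null;
  if (!isStringArray(attendees) || !isStringArray(citations)) return null;
  if (typeof timeUnits !== "number" || !Number.isInteger(timeUnits) || timeUnits < 0) return null;
  return { heading, body, attendees, timeUnits, citations };
}

// Bounded retry: a malformed reply costs another call, never a malformed note.
function draftWithRetry(callModel: (attempt: number) => string, maxAttempts: number = 3): NoteDraftShape {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const parsed = parseNoteDraft(callModel(attempt));
    if (parsed !== null) return parsed;
  }
  throw new Error(`no schema-valid draft after ${maxAttempts} attempts`);
}
```

Capping the attempts matters for the meter as much as for quality: each retry is a billable model call, so a job that cannot produce valid output should fail loudly and land in job_run with a failed status.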

🙋 Customer

Pick "draft attendance note" → get a finished draft + a time entry. Each job is a defined task, like giving work to a paralegal.

⚖ Compliance

Bounded, named jobs (not freeform advice) keep the AI on the lawful side of the reserved-activity line (D4); each new job kind gets a regulatory check (Q1).

stack: pg-boss · zod

P3.2

Re-hydrate · review · sign · edit-rate · audit

👤 Lay

The placeholders are swapped back to real names (only on the firm's side), a person reads, edits and signs; nothing is final until they do; every step is recorded permanently.

⚙ Dev

# on sign: one transaction
rehydrate(note.body, tokenMap)                       # firm-side only
note.state = 'signed'; note.signed_by = user; note.signed_at = now()
job_run.edit_distance = diff(aiDraft, finalText)     # quality metric
insert audit_log(action:'note.sign', before, after, tool:'attendance_note')
time_entry.signed = true

State machine draft → review → signed. Edit-distance is captured per job kind (D10.4) — the running quality + honesty check on the time-equivalent. One write = provenance + audit + bill line.
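Two pieces of the sign step are easy to illustrate: swapping placeholders back using the tenant-side map, and measuring how far the signed text moved from the AI draft. The sketch below uses Levenshtein distance as one possible reading of diff(aiDraft, finalText); the source does not say which metric is intended, and both function names are hypothetical:

```typescript
// Replace [TYPE_n] placeholders with real values; unknown tokens are left in place
// so a gap in the map is visible to the reviewer rather than silently dropped.
function rehydrate(body: string, tokenMap: Record<string, string>): string {
  return body.replace(/\[[A-Z_]+_\d+\]/g, (token) => tokenMap[token] ?? token);
}

// Levenshtein distance with a rolling row: the number of single-character
// inserts, deletes and substitutions needed to turn a into b.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(
        (prev[j] ?? 0) + 1,        // delete from a
        (curr[j - 1] ?? 0) + 1,    // insert into a
        (prev[j - 1] ?? 0) + cost, // substitute or match
      ));
    }
    prev = curr;
  }
  return prev[b.length] ?? 0;
}
```

Comparing the draft and the final text while both still hold placeholders keeps the metric free of real values, so edit_distance can sit in job_run without widening where personal data lives.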

🙋 Customer

A fee-earner always has the final say; there's a clear, unchangeable record of who did what.

⚖ Compliance

Human sign-off + append-only audit = accountability + tamper-evidence. Edit-rate evidences output quality and keeps billed time defensible.

stack: (Postgres tx + Drizzle)

Channels — the doors into the pipeline

calls (LiveKit)→ transcribe→ email (Graph)→ internal calls→ matter-match

Done when — a real call records → transcribes → enters the pipeline and lands on the right matter; an inbound email does the same without transcription; consent notice + lawful basis recorded per channel.

P4.1–4.2

Calls, recording & transcription

👤 Lay

Built-in calls that record automatically (with a notice), then turn into text on the firm's own machine.

⚙ Dev

LiveKit (self-hosted WebRTC SFU) + egress recording → object store; a webhook fires recording.ended {uri, matterHint}. A pg-boss job runs faster-whisper (in-tenant, batch) → transcript. Audio never leaves the tenant. Legacy phone lines: ingest existing call-recording files (bridge), or managed SIP via Twilio if a number is needed. Edge cases: diarisation for who-said-what; partials discarded; failed transcribe → retry then flag.

🙋 Customer

Calls are captured without anyone remembering to record; even the audio stays inside your walls.

⚖ Compliance

Recorded-call notice + lawful basis (legitimate interests + a documented LIA); recording + transcription stay in-tenant; ICO Jan-2026 — disclose AI-assisted analysis in the privacy notice; retention auto-delete on recording.retention_until.

stack: LiveKit · faster-whisper · Twilio (legacy)

P4.3–4.4

Email, internal calls & matter-matching

👤 Lay

Emails and internal staff calls flow the same way, and the system works out which client/case each one belongs to.

⚙ Dev

Microsoft Graph: a /subscriptions webhook on the mailbox (delta query for new mail) + Teams call recordings; validate the webhook token; emails skip transcription. Each door normalises to {text, source, matterId?, participants, ts}. Matter-matching: rules first (caller number → party → matter; email sender/thread-id → matter), then a small MCP subagent for fuzzy cases; the matter list comes from the firm's practice-management system (Clio/LEAP) via API. Unmatched → an intake queue.
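The rules-first pass of matter-matching might be sketched as below. The index structure, the rule order (caller number, then thread id, then sender) and all names are assumptions for illustration; a miss is returned explicitly so the caller can hand it to the fuzzy subagent and then the intake queue:

```typescript
interface InboundDoorItem {
  source: "call" | "email";
  callerNumber?: string;
  sender?: string;
  threadId?: string;
}

// Hypothetical lookup tables, built from party/matter rows and known threads.
interface MatterIndex {
  matterByPhone: Map<string, string>;  // normalised number -> matter id (via party)
  matterByThread: Map<string, string>; // email thread id -> matter id
  matterBySender: Map<string, string>; // lower-cased address -> matter id
}

type MatchResult =
  | { matterId: string; rule: "phone" | "thread" | "sender" }
  | { matterId: null; rule: "none" };

// Reduce common UK number spellings to one +44 form before lookup.
function normaliseUkPhone(raw: string): string {
  const cleaned = raw.replace(/[^\d+]/g, "");
  if (cleaned.startsWith("00")) return "+" + cleaned.slice(2);
  if (cleaned.startsWith("0")) return "+44" + cleaned.slice(1);
  return cleaned;
}

function matchByRules(item: InboundDoorItem, index: MatterIndex): MatchResult {
  if (item.callerNumber !== undefined) {
    const hit = index.matterByPhone.get(normaliseUkPhone(item.callerNumber));
    if (hit !== undefined) return { matterId: hit, rule: "phone" };
  }
  if (item.threadId !== undefined) {
    const hit = index.matterByThread.get(item.threadId);
    if (hit !== undefined) return { matterId: hit, rule: "thread" };
  }
  if (item.sender !== undefined) {
    const hit = index.matterBySender.get(item.sender.toLowerCase());
    if (hit !== undefined) return { matterId: hit, rule: "sender" };
  }
  return { matterId: null, rule: "none" }; // next stop: fuzzy subagent, then intake queue
}
```

Recording which rule fired gives the audit trail a reason for each attribution, which helps when a fee-earner later asks why a call was filed where it was.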

🙋 Customer

Email and internal calls get the same notes + time automatically, filed to the right matter.

⚖ Compliance

Email is already a record; internal calls need a staff notice. The same egress gate applies to every door. Correct matter attribution underpins confidentiality and accurate records.

stack: Microsoft Graph · pgvector (match)

Native surfaces — bring online in order

time-ledger→ matter timeline→ inbox (bridge)→ tasks→ knowledge→ dashboards

Done when — each surface reads/writes the one spine, is AI-assisted via MCP, and the firm can run a full matter day inside the OS; bridges import from Outlook/DMS/Teams and can be retired.

P5.1–5.6

The surfaces

👤 Lay

The everyday screens: the timesheet, each matter's story, the inbox, the task board, the firm's saved know-how, and the partner's overview.

⚙ Dev

Time-ledger — query over time_entry grouped by actor/tier/day; the billing source.
Matter timeline — projection over audit_log + events for one matter (event-sourced view).
Inbox — Graph mirror; each message matched to a matter + an AI action chip.
Tasks/workflow — board schema (status flow + approval gates); MCP tool task.create/move.
Knowledge — embeddings in memory.embedding; ORDER BY embedding <=> query (pgvector) for semantic search of precedents.
Dashboards — materialised views refreshed on a schedule (matters active, time logged, needs-you).

Canvas/co-edit on tldraw + Yjs. Every surface exposes an MCP interface so subagents act on it.

🙋 Customer

Your whole operation in one place — and timekeeping becomes automatic. You can keep using Outlook/your document system during the move; they import and run alongside, then retire.

⚖ Compliance

One memory/audit/identity across surfaces = consistent retention, access and SAR handling. Bridges import-then-retire — no permanent third-party path in the steady state.

stack: tldraw · Yjs · pgvector

Billing — the virtual fee-earner

usage_event → time-equiv→ rate card→ Stripe statement

Done when — a month of usage_event rolls up into AI-hours by tier, priced by the rate card, and produces a Stripe invoice that reads as a headcount line; the firm never sees tokens.

P6.1

Meter → time → bill

👤 Lay

The AI's work is turned into time (6-minute units) and billed like a member of staff.

⚙ Dev

# minutes-per-job × tier rate = the whole economics (benchmark before publishing)
JOB_MINUTES = { attendance_note: 12, file_note: 9, summarise: 6, letter: 18, ... }
TIER_RATE   = { junior: £, paralegal: £, associate: £ }   # per hour-equivalent

# monthly rollup
job units = Σ usage_event → minutesEquivalent per job → /60 → × TIER_RATE
          → Stripe invoice items (one line per tier) + platform fee + seats

Tokens stay internal cost + spend-cap; the firm's statement shows hours by tier. Stripe metered/invoice-items; monthly cron rollup.
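The rollup arithmetic can be sketched as below. The job minutes are the figures from the table above; the rates are placeholders invented for the example (the source leaves them blank), held in pence so the sums stay in integers, and the type and function names are hypothetical:

```typescript
type BillTier = "junior" | "paralegal" | "associate";

interface UsageRow {
  jobKind: string;
  tier: BillTier;
}

interface TierLine {
  minutes: number;
  amountPence: number;
}

const JOB_MINUTES: Record<string, number> = {
  attendance_note: 12,
  file_note: 9,
  summarise: 6,
  letter: 18,
};

// PLACEHOLDER rates in pence per hour-equivalent; not real pricing.
const EXAMPLE_RATE_PENCE: Record<BillTier, number> = {
  junior: 3000,
  paralegal: 6000,
  associate: 12000,
};

function rollupByTier(rows: UsageRow[], ratePence: Record<BillTier, number>): Record<BillTier, TierLine> {
  const minutes: Record<BillTier, number> = { junior: 0, paralegal: 0, associate: 0 };
  for (const row of rows) {
    const perJob = JOB_MINUTES[row.jobKind];
    if (perJob === undefined) {
      // An unpriced job kind is a catalogue bug; refuse to bill it silently.
      throw new Error(`no minutes-equivalent for job kind ${row.jobKind}`);
    }
    minutes[row.tier] += perJob;
  }
  const line = (tier: BillTier): TierLine => ({
    minutes: minutes[tier],
    amountPence: Math.round((minutes[tier] * ratePence[tier]) / 60),
  });
  return { junior: line("junior"), paralegal: line("paralegal"), associate: line("associate") };
}
```

With the placeholder rates, two attendance notes and one letter at paralegal tier come to 42 minutes, or 4200 pence; each returned line would become one Stripe invoice item, and tokens never appear in the calculation.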

🙋 Customer

A bill that reads "AI fee-earners: N hours" — a headcount line, not a software bill, and far below a human rate.

⚖ Compliance

Minutes-per-job must be defensible + shown transparently (or it reads as padding). Whether AI time can be billed on to the client is the firm's SRA/costs decision, not the vendor's.

stack: Stripe

Legal-gap tools (each a new billable job)

conflicts→ key-dates→ intake + engagement→ AML/ID→ precedents

Done when — each tool is a catalogue job with its own deliverable + time-equivalent, runs behind the egress gate, and has passed a per-tool regulatory check (stays back-office).

P7.1–7.5

The legal-only surfaces

👤 Lay

Tools a general office system doesn't have: checking new clients don't clash, tracking legal deadlines, onboarding + engagement letters, ID/AML checks, and a library of standard documents.

⚙ Dev

Conflicts — embed the new party, ORDER BY embedding <=> new over party; threshold → clear/near/conflict; near → human.
Key-dates — rules engine producing flags (never advice); writes reminders to tasks.
Intake + engagement — form → create matter + run an engagement_letter job.
AML/ID — call a checks provider (e.g. Onfido/ComplyAdvantage) API; store the result, not the raw documents beyond retention.
Precedents — template store in memory (scope=firm), pgvector search.
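The conflicts check leans on pgvector's <=> operator, which returns cosine distance (1 minus cosine similarity). The sketch below reproduces that distance and the three-way banding in application code to make the thresholds concrete; the cut-off values are invented for the example and would need tuning on the firm's own party data, and all names are hypothetical:

```typescript
type ConflictVerdict = "clear" | "near" | "conflict";

interface PartyVector {
  partyId: string;
  embedding: number[];
}

// Cosine distance, the quantity pgvector's <=> operator returns.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must share a non-zero length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) throw new Error("zero vector has no direction");
  return 1 - dot / Math.sqrt(normA * normB);
}

// PLACEHOLDER thresholds: smaller distance means a closer match.
function classifyDistance(distance: number, conflictAt: number = 0.1, nearAt: number = 0.25): ConflictVerdict {
  if (distance <= conflictAt) return "conflict";
  if (distance <= nearAt) return "near"; // goes to a human, per the text
  return "clear";
}

function screenParty(candidate: number[], existing: PartyVector[]) {
  return existing
    .map((p) => {
      const distance = cosineDistance(candidate, p.embedding);
      return { partyId: p.partyId, distance, verdict: classifyDistance(distance) };
    })
    .sort((x, y) => x.distance - y.distance); // closest first, like ORDER BY embedding <=> new
}
```

In the database the ordering and a LIMIT would be done by the query itself; the banding is the part worth keeping in code, where the thresholds can be versioned and the "near" band routed to review.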

🙋 Customer

The legal-specific chores handled in the same place, each saving real time.

⚖ Compliance

Key-dates flag (never advise) to stay back-office; AML supports the firm's own MLR obligations; conflicts support confidentiality. Each new job gets a regulatory check (Q1) — none may cross into reserved activity (D4).

stack: pgvector · AML/ID provider API

Harden & ship (before the first real firm)

tests→ monitoring→ backups/retention→ Cyber Essentials→ signing→ deploy

Done when — an E2E test proves call→signed-note with no raw text reaching the relay; errors/metrics are PII-free; encrypted backups + auto-delete run; Cyber Essentials passed; images signed; the bundle deploys into a real firm's cloud.

P8.1

Tests, monitoring, backups, retention

👤 Lay

Make sure it works end-to-end, watch for crashes, never lose data, and auto-delete old recordings on schedule.

⚙ Dev

Jest (unit) + Playwright E2E (call → transcript → redact → draft → sign → audit); a dedicated test asserts the relay received only masked text (the invariant, behaviourally). GlitchTip/Sentry for errors with a PII scrubber; OpenTelemetry counts only. Backups: pg_dump + object-store snapshot, encrypted to R2, restore-tested. Retention: nightly cron deletes where retention_until < now().

🙋 Customer

Reliable, with old recordings auto-deleted on a schedule you set, and backups you can restore.

⚖ Compliance

Retention/auto-delete + PII-free telemetry are explicit DP controls; the E2E test is positive evidence of the redact-before-model path; backup encryption + restore-test for resilience.

stack: Jest · Playwright · GlitchTip · OpenTelemetry

P8.2

Certs, image signing, first deploy

👤 Lay

Get the basic UK security badge, lock the build so only approved code runs, then install for the first firm.

⚙ Dev

Cyber Essentials (self-assessment, ~£300–500, early). Cosign/Sigstore signs every image in CI. Watchtower does not verify signatures itself, so a cosign verify step gates promotion and Watchtower only follows the promoted tag (controlled promote). Deploy: docker compose pull && docker compose up -d on the firm's VM with their .env (§C). ISO 27001 deferred to P9.

🙋 Customer

Recognised security baseline; only approved versions can ever run; installed into your cloud.

⚖ Compliance

Cyber Essentials is the entry baseline UK firms ask for; signed images support handover/escrow trust (only an approved build runs). ISO 27001 when enterprise demand justifies it.

stack: Cyber Essentials · Cosign · Watchtower

Enterprise (only after a signed LOI)

P9.1

Multi-tenancy + ISO 27001

👤 Lay

Only once a big firm commits: support many offices in one install, and get the bigger security certificate.

⚙ Dev

Add a tenant_id to every table + a tenant RLS policy layered under the existing role/matter policies; per-tenant LiteLLM keys/quotas; per-tenant config/branding. Do not build before an LOI. Swap pg-boss → BullMQ/Valkey if throughput demands.

🙋 Customer

Scales to a large multi-office firm when needed.

⚖ Compliance

Tenant isolation by RLS; ISO 27001 (~£10–40k, 6–12 mo) for £100k+ RFPs — start only when the pipeline justifies it.

stack: Valkey · BullMQ · ISO 27001

§C · Environment variables (per tenant)

# database / sync
DATABASE_URL=postgres://...
ELECTRIC_URL=...

# identity
OIDC_ISSUER=...
OIDC_CLIENT_ID=...
OIDC_CLIENT_SECRET=...

# relay + model (the key is n-law-issued, short-lived, rotated)
RELAY_MASTER_KEY=...
TENANT_CAP=...
AZURE_UK_ENDPOINT=...
AZURE_API_KEY=...
ANTHROPIC_ZDR_ENDPOINT=...
ANTHROPIC_API_KEY=...

# storage / calls / billing / telemetry
R2_BUCKET=...
R2_KEY=...
LIVEKIT_URL=...
LIVEKIT_API_KEY=...
STRIPE_KEY=...
GLITCHTIP_DSN=...
GRAPH_CLIENT_ID=...
GRAPH_TENANT_ID=...

# secrets via the host's secret manager; never committed; gitleaks in CI

§D · Services & repos — master index

Canonical project/service links. Licences as commonly understood mid-2026 — confirm before commercial use (esp. flagged).

What	Link	Licence / note
Turborepo	github.com/vercel/turborepo	MPL-2.0
pnpm	pnpm.io	MIT
Next.js	github.com/vercel/next.js	MIT
TypeScript	typescriptlang.org	Apache-2.0
Tailwind	tailwindcss.com	MIT
shadcn/ui	github.com/shadcn-ui/ui	MIT
PostgreSQL	postgresql.org	PostgreSQL Lic
pgvector	github.com/pgvector/pgvector	PostgreSQL Lic
Drizzle ORM	github.com/drizzle-team/drizzle-orm	Apache-2.0
zod	github.com/colinhacks/zod	MIT
ElectricSQL	github.com/electric-sql/electric	Apache-2.0
Auth.js	github.com/nextauthjs/next-auth	ISC
Keycloak	github.com/keycloak/keycloak	Apache-2.0
Microsoft Graph	learn.microsoft.com/graph	service (M365)
Docker/Compose	docker.com	Apache-2.0
Presidio	github.com/microsoft/presidio	MIT
spaCy	github.com/explosion/spaCy	MIT
Semgrep	github.com/semgrep/semgrep	LGPL-2.1 + free
Gitleaks	github.com/gitleaks/gitleaks	MIT
LiteLLM (relay)	github.com/BerriAI/litellm	MIT
Azure OpenAI	azure openai	service · UK South ZDR
Anthropic API	docs.anthropic.com	service · ZDR
OpenRouter	openrouter.ai	dev/synthetic only
MCP	modelcontextprotocol.io	open spec
pg-boss	github.com/timgit/pg-boss	MIT
LiveKit	github.com/livekit/livekit	Apache-2.0
faster-whisper	github.com/SYSTRAN/faster-whisper	MIT
tldraw	github.com/tldraw/tldraw	tldraw licence — check
Yjs	github.com/yjs/yjs	MIT
Cloudflare R2	developers.cloudflare.com/r2	service
SeaweedFS	github.com/seaweedfs/seaweedfs	Apache-2.0
Stripe	stripe.com/docs	service
GlitchTip	gitlab.com/glitchtip	MIT
OpenTelemetry	opentelemetry.io	Apache-2.0
Jest	github.com/jestjs/jest	MIT
Playwright	github.com/microsoft/playwright	Apache-2.0
Cosign/Sigstore	github.com/sigstore/cosign	Apache-2.0
Watchtower	github.com/containrrr/watchtower	Apache-2.0
Valkey	github.com/valkey-io/valkey	BSD-3
BullMQ	github.com/taskforcesh/bullmq	MIT
Twilio	twilio.com/docs/voice	service
Cyber Essentials	ncsc.gov.uk	cert (~£300–500)
ISO 27001	iso.org	cert (later)

drop-list / watch-list

MinIO dropped (maintenance-mode, no security patches) → R2/S3 or SeaweedFS. Firebase/Firestore forbidden (Google-hosted, breaks in-tenant) → ElectricSQL. CodeQL needs paid GitHub Advanced Security for private repos → Semgrep. Sentry FSL = fine to self-host internally. tldraw + Semgrep terms = confirm for commercial use.

Source of record: task_plan.md (D1–D12) + STUDY.md + findings.md. Build doc v3 (full-detail) · 13 Jun 2026 · time-ordered P0–P9 · 4 reader angles, click any column to switch · code/schemas are reference sketches for build, not final source · links canonical, not re-checked this build · not legal advice — DP + regulatory posture needs lawyer sign-off before any real client data.