Sherpii Platform — Technical Plan

1. Overview

Sherpii is a multi-tenant, multi-caregiver clinic management platform designed to support mental health and allied health clinics of varying sizes. Each clinic operates as an isolated tenant with its own subdomain, user roles, patient data, and configuration. The platform consolidates session management, clinical note-taking, AI-assisted summarisation, patient communications, homework tracking, and escalation alerting into a single cohesive product.

The core tech stack is React 18 / TypeScript with shadcn/ui on the front end, a Node.js 20 / TypeScript API server on the back end, PostgreSQL as the primary relational database managed via Prisma ORM, and AWS Cognito for identity and authentication. All infrastructure runs on AWS and is provisioned as code using Terraform or AWS CDK. The monorepo is managed with pnpm workspaces.

2. Repository & Monorepo Structure

The project is a pnpm workspace monorepo. All packages share a root-level pnpm-workspace.yaml and a root package.json for scripts and shared dev dependencies.

sherpii/
├── pnpm-workspace.yaml
├── package.json                  # root scripts, shared devDeps
├── biome.json                    # or .eslintrc.js at root
├── apps/
│   ├── web/                      # React/Vite frontend PWA
│   │   ├── src/
│   │   ├── public/
│   │   ├── vite.config.ts
│   │   └── package.json
│   └── api/                      # Fastify API server (Node.js, TypeScript)
│       ├── src/
│       │   ├── routes/
│       │   ├── services/
│       │   ├── jobs/
│       │   ├── plugins/
│       │   └── server.ts
│       └── package.json
├── packages/
│   ├── ui/                       # Shared shadcn/ui component library
│   │   ├── src/components/
│   │   ├── src/index.ts
│   │   └── package.json
│   ├── types/                    # Shared TypeScript types + Zod schemas
│   │   ├── src/
│   │   └── package.json
│   ├── db/                       # Prisma client + migrations
│   │   ├── prisma/
│   │   │   ├── schema.prisma
│   │   │   └── migrations/
│   │   ├── src/
│   │   └── package.json
│   └── config/                   # Shared configs: eslint, tsconfig, tailwind
│       ├── eslint/
│       ├── tsconfig/
│       │   ├── base.json
│       │   ├── react.json
│       │   └── node.json
│       └── tailwind/
└── infra/                        # Terraform modules (or AWS CDK)
    ├── modules/
    │   ├── ecs/
    │   ├── rds/
    │   ├── cognito/
    │   ├── redis/
    │   └── s3/
    ├── environments/
    │   ├── dev/
    │   ├── staging/
    │   └── production/
    └── main.tf

Key workspace conventions:

packages/types is consumed by both apps/web and apps/api; packages/db (the Prisma client) is consumed by apps/api only, because the Prisma client cannot run in the browser
packages/ui is consumed only by apps/web (and potentially a future patient-facing PWA)
Internal packages use "sherpii-*" naming (e.g. sherpii-ui, sherpii-types)
TypeScript project references are used to keep incremental builds fast

3. Frontend Stack

Core Libraries

Concern	Library	Version
Framework	React	18.x
Language	TypeScript	5.x
Build tool	Vite	5.x
Component system	shadcn/ui (Radix UI + Tailwind)	latest
Styling	Tailwind CSS	v3.x
Routing	React Router	v6.x
Server state	TanStack Query (React Query)	v5.x
Client state	Zustand	v4.x
Forms	React Hook Form + Zod	latest
Date handling	date-fns	v3.x
Animations	Framer Motion	v11.x
Icons	Lucide React	latest
PWA	`vite-plugin-pwa` (Workbox)	latest

Architecture Decisions

Route-based code splitting. All top-level routes are lazy-loaded via React.lazy + Suspense. Each feature area (dashboard, calendar, sessions, patients, comms, reports) is a separate chunk. The target is an initial bundle under 200 KB gzipped.

// apps/web/src/router.tsx
import React from 'react';

// React.lazy expects each feature module to default-export its root component.
const Dashboard = React.lazy(() => import('./features/dashboard'));
const Calendar = React.lazy(() => import('./features/calendar'));
const Sessions = React.lazy(() => import('./features/sessions'));

TanStack Query as the data layer. All server data (patients, sessions, notes) lives in the React Query cache. Mutations use onMutate + onError rollback for optimistic UI updates — especially for note saving and calendar event drag-and-drop.
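The snapshot-and-rollback pattern behind optimistic updates can be sketched without any library. This is a minimal illustration, assuming a plain Map as a stand-in for the React Query cache; applyOptimistic and rollback are hypothetical names that play the roles of the onMutate and onError callbacks.

```typescript
// Hypothetical in-memory stand-in for the React Query cache.
type Cache<T> = Map<string, T>;

interface OptimisticContext<T> {
  key: string;
  previous: T | undefined; // snapshot taken before the optimistic write
}

// Plays the role of onMutate: snapshot the current value, then write the optimistic one.
function applyOptimistic<T>(
  cache: Cache<T>,
  key: string,
  update: (current: T | undefined) => T,
): OptimisticContext<T> {
  const previous = cache.get(key);
  cache.set(key, update(previous));
  return { key, previous };
}

// Plays the role of onError: restore the snapshot captured in applyOptimistic.
function rollback<T>(cache: Cache<T>, ctx: OptimisticContext<T>): void {
  if (ctx.previous === undefined) {
    cache.delete(ctx.key);
  } else {
    cache.set(ctx.key, ctx.previous);
  }
}

// Example: a note body is edited optimistically, then the save is rejected.
const noteCache: Cache<{ body: string }> = new Map([["note-1", { body: "draft v1" }]]);
const noteCtx = applyOptimistic(noteCache, "note-1", () => ({ body: "draft v2" }));
const afterOptimistic = noteCache.get("note-1")?.body;
rollback(noteCache, noteCtx); // simulate the server rejecting the mutation
const afterRollback = noteCache.get("note-1")?.body;
```

In the real app the snapshot would presumably be read and restored through the query client rather than a Map, with the context object passed from onMutate to onError.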

Zustand for client-only state. Used for: active session state (recording in progress, transcription buffer), sidebar collapse state, notification panel open/close, and ephemeral form drafts. Zustand stores are colocated with their feature folder and are not persisted to localStorage except for non-sensitive UI preferences.

Real-time. A singleton socket.io-client connection is established on login (the server runs socket.io v4, so a bare native WebSocket client would not speak its protocol). The connection joins the clinic:{clinicId} and user:{userId} rooms. Events emitted from the server:

transcription:chunk — partial transcript text for the live transcription panel
alert:new — new alert triggers an in-app notification badge
session:updated — another caregiver edited a shared session

A Server-Sent Events fallback (EventSource) is used for one-way feeds (alerts, transcription) if the WebSocket connection fails.

Feature flags. On app boot, a lightweight GET /api/config endpoint returns a features map keyed by clinic ID. This controls progressive rollout of beta features (e.g. AI summarisation, WhatsApp integration). The flag map is stored in Zustand and consumed by a useFeatureFlag(key) hook.
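A minimal sketch of the lookup that a useFeatureFlag(key) hook could delegate to. The ClinicConfig shape, the flag names, and the isFeatureEnabled helper are assumptions for illustration; the plan does not specify the response body of GET /api/config.

```typescript
// Hypothetical shape of the GET /api/config response body.
interface ClinicConfig {
  clinicId: string;
  features: Record<string, boolean>;
}

// Unknown or missing flags default to "off", so a beta feature stays hidden
// unless the server explicitly enables it for the clinic.
function isFeatureEnabled(config: ClinicConfig | null, key: string): boolean {
  return config?.features[key] === true;
}

// Example config with hypothetical flag names.
const exampleConfig: ClinicConfig = {
  clinicId: "clinic_123",
  features: { aiSummarisation: true, whatsappIntegration: false },
};
```

Defaulting to false also covers the window before the config request has resolved, when the store would still hold null.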

PWA. The vite-plugin-pwa plugin generates a service worker using Workbox. Cache strategy:

App shell (HTML, JS, CSS): CacheFirst
API responses: NetworkFirst with a 5-second timeout fallback
Static assets (icons, fonts): CacheFirst with 30-day expiry
The app is installable on mobile and desktop (Web App Manifest, splash screens, icons)

4. Backend Stack

Core Libraries

Concern	Library
Runtime	Node.js 20 LTS
Language	TypeScript 5.x
HTTP framework	Fastify v4
ORM	Prisma v5
Database	PostgreSQL 15
Validation	Zod
Job queues	BullMQ + ioredis
Auth middleware	`@fastify/jwt` + Cognito JWKS
API docs	`@fastify/swagger` + `@fastify/swagger-ui`
AI	Anthropic SDK / OpenAI SDK
HTTP client	`undici` (the library behind Node's built-in `fetch` since Node 18)
Logging	pino (Fastify default)

API Design Principles

RESTful JSON with OpenAPI 3.1. All routes are documented via Fastify's schema system. The OpenAPI spec is auto-generated at GET /api/docs/openapi.json and served via Swagger UI in non-production environments.

Consistent error format:

{
  "code": "PATIENT_NOT_FOUND",
  "message": "No patient found with the given ID",
  "details": { "patientId": "abc-123" }
}
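The error format above could be typed and produced by one small helper so that every route returns the same shape. This is a sketch only: the apiError helper and the code-to-status mapping are assumptions, not part of the plan.

```typescript
interface ApiError {
  code: string;
  message: string;
  details?: Record<string, unknown>;
}

interface ApiErrorResponse {
  statusCode: number;
  body: ApiError;
}

// Hypothetical mapping from error code to HTTP status; unknown codes fall back to 500.
const STATUS_BY_CODE: Record<string, number> = {
  PATIENT_NOT_FOUND: 404,
  VALIDATION_FAILED: 400,
  FORBIDDEN: 403,
};

function apiError(
  code: string,
  message: string,
  details?: Record<string, unknown>,
): ApiErrorResponse {
  const body: ApiError = { code, message };
  if (details !== undefined) {
    body.details = details; // omit the key entirely when there is nothing to report
  }
  return { statusCode: STATUS_BY_CODE[code] ?? 500, body };
}

const notFound = apiError("PATIENT_NOT_FOUND", "No patient found with the given ID", {
  patientId: "abc-123",
});
```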

Cursor-based pagination for all list endpoints:

{
  "data": [...],
  "pagination": {
    "nextCursor": "eyJpZCI6IjEyMyJ9",
    "hasMore": true,
    "limit": 20
  }
}

Multi-tenancy enforcement. Every Fastify route handler receives a clinicId extracted from the verified JWT. All Prisma queries include where: { clinicId } as a hard constraint. A custom Fastify plugin (clinicScope.plugin.ts) injects this at the request lifecycle level — it is not left to individual route handlers to remember.

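// Example scoped query: fetch one row beyond the page size to detect a next page.
const PAGE_SIZE = 20;

// cursor is the decoded id of the last patient on the previous page.
async function getPatients(clinicId: string, cursor?: string) {
  return prisma.patient.findMany({
    where: { clinicId, status: 'ACTIVE' },
    // Start after the cursor row; without skip: 1 the cursor row itself is returned again.
    ...(cursor ? { cursor: { id: cursor }, skip: 1 } : {}),
    take: PAGE_SIZE + 1,
    // id breaks ties so rows with an equal createdAt keep a stable order.
    orderBy: [{ createdAt: 'desc' }, { id: 'desc' }],
  });
}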
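The take: 21 in the query is one more than the page limit of 20. A sketch of how that extra row might be turned into the pagination envelope follows; buildPage is a hypothetical helper, the cursor is the raw id of the last returned row (the base64 wrapping seen in the sample response is left out), and the next query is assumed to skip the cursor row itself.

```typescript
interface Page<T> {
  data: T[];
  pagination: { nextCursor: string | null; hasMore: boolean; limit: number };
}

// rows must come from a query that asked for limit + 1 rows.
function buildPage<T extends { id: string }>(rows: T[], limit: number): Page<T> {
  const hasMore = rows.length > limit;           // the extra row proves another page exists
  const data = hasMore ? rows.slice(0, limit) : rows;
  const last = data[data.length - 1];
  return {
    data,
    pagination: {
      nextCursor: hasMore && last !== undefined ? last.id : null,
      hasMore,
      limit,
    },
  };
}
```

Fetching one extra row avoids a separate COUNT query just to decide whether to show a "load more" control.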

5. Authentication & Authorisation

Flow

User navigates to {clinic}.sherpii.io
Frontend resolves the clinicId from the subdomain via GET /api/clinics/resolve?subdomain={subdomain}
Frontend redirects to the Cognito Hosted UI with PKCE (code_challenge, code_verifier)
After successful login, Cognito redirects back with an authorisation code
Frontend exchanges the code for tokens (/oauth2/token) and receives:
- access_token (JWT, stored in memory only)
- id_token (JWT, stored in memory only)
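- refresh_token (stored in an httpOnly, Secure, SameSite=Strict cookie; because browser JavaScript cannot write httpOnly cookies, this requires the exchange to go through an API endpoint that sets the cookie)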
Every API request includes Authorization: Bearer {access_token}
The API validates the JWT signature against Cognito's JWKS endpoint (/.well-known/jwks.json) using @fastify/jwt. The JWKS is cached in memory with a 12-hour TTL.
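The first two steps of the flow depend on reading the clinic subdomain from the hostname before calling the resolve endpoint. A minimal sketch, assuming a single-label subdomain under sherpii.io and a hypothetical list of reserved labels (auth and mail appear elsewhere in this plan; www and api are guesses):

```typescript
const BASE_DOMAIN = "sherpii.io";
// Hypothetical reserved labels that must never resolve to a clinic.
const RESERVED_LABELS = new Set(["www", "auth", "mail", "api"]);

// Returns the clinic subdomain, or null when the host is not a clinic host.
function extractClinicSubdomain(hostname: string): string | null {
  const host = hostname.toLowerCase();
  const suffix = "." + BASE_DOMAIN;
  if (!host.endsWith(suffix)) return null;
  const label = host.slice(0, -suffix.length);
  // Exactly one DNS label is expected in front of the base domain.
  if (label.length === 0 || label.includes(".")) return null;
  if (!/^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/.test(label)) return null;
  return RESERVED_LABELS.has(label) ? null : label;
}
```

The frontend would pass the result as the subdomain query parameter of GET /api/clinics/resolve and show an error page when it is null.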

No session state on the server. The API is fully stateless. The refresh_token cookie is handled by a dedicated POST /api/auth/refresh endpoint which exchanges it with Cognito and returns a new access token.
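Signature verification is only part of stateless validation; the claims also have to be checked on every request. The sketch below covers the claim checks only and assumes the signature has already been verified. It relies on Cognito access tokens carrying client_id and token_use claims and an issuer of the form https://cognito-idp.{region}.amazonaws.com/{userPoolId}; the function name and the example pool and client IDs are hypothetical.

```typescript
interface AccessTokenClaims {
  iss: string;
  client_id: string;
  token_use: string;
  exp: number; // seconds since the epoch
  sub: string;
}

interface ExpectedToken {
  issuer: string;
  clientId: string;
}

type ClaimsResult = { ok: true } | { ok: false; reason: string };

// Checks claims only; the JWT signature must be verified against the JWKS first.
function checkAccessTokenClaims(
  claims: AccessTokenClaims,
  expected: ExpectedToken,
  nowSeconds: number,
): ClaimsResult {
  if (claims.iss !== expected.issuer) return { ok: false, reason: "issuer mismatch" };
  if (claims.client_id !== expected.clientId) return { ok: false, reason: "client mismatch" };
  if (claims.token_use !== "access") return { ok: false, reason: "not an access token" };
  if (claims.exp <= nowSeconds) return { ok: false, reason: "token expired" };
  return { ok: true };
}

// Placeholder values for illustration only.
const expectedToken: ExpectedToken = {
  issuer: "https://cognito-idp.eu-west-1.amazonaws.com/eu-west-1_EXAMPLE",
  clientId: "example-app-client-id",
};
```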

Roles & Permissions

Role	Description
`CAREGIVER`	Sees own patients and sessions only
`SUPERVISOR`	Sees all patients within the clinic; can review caregiver notes
`ADMIN`	Full clinic access; manages users, settings, integrations

Roles are stored as Cognito group memberships and mirrored in the User.role column in PostgreSQL for fast query-time access checks. A requireRole(roles: Role[]) Fastify hook enforces role access per route group.

Row-Level Security

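Beyond role checks, all queries are scoped in application code (this is query-level scoping, not PostgreSQL's native row-level security policies):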

Caregivers: WHERE clinicId = $1 AND caregiverId = $2
Supervisors/Admins: WHERE clinicId = $1

This is enforced in a service layer, not just in route handlers, to prevent accidental data leaks via internal service calls.
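One way the service layer could apply the two scoping rules is to derive the scope from the authenticated user and merge it over any caller-supplied filters. This is a sketch under assumptions: AuthUser, patientScope, and scopedWhere are hypothetical names, and the resulting object is meant to be passed as a Prisma where clause.

```typescript
type Role = "CAREGIVER" | "SUPERVISOR" | "ADMIN";

interface AuthUser {
  userId: string;
  clinicId: string;
  role: Role;
}

interface PatientScope {
  clinicId: string;
  caregiverId?: string;
}

// Caregivers are limited to their own caseload; supervisors and admins see the clinic.
function patientScope(user: AuthUser): PatientScope {
  if (user.role === "CAREGIVER") {
    return { clinicId: user.clinicId, caregiverId: user.userId };
  }
  return { clinicId: user.clinicId };
}

// The scope is spread last, so caller-supplied filters can never override it.
function scopedWhere<F extends Record<string, unknown>>(user: AuthUser, filters: F) {
  return { ...filters, ...patientScope(user) };
}
```

Because the scope wins the merge, a caregiver who sends another caregiver's ID or another clinic's ID as a filter still only sees their own patients.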

6. Database Schema (Prisma)

model Clinic {
  id        String   @id @default(cuid())
  name      String
  subdomain String   @unique
  plan      String   @default("starter")
  settings  Json     @default("{}")
  createdAt DateTime @default(now())

  users    User[]
  patients Patient[]
}

model User {
  id          String   @id @default(cuid())
  clinicId    String
  cognitoSub  String   @unique
  role        Role
  name        String
  email       String
  preferences Json     @default("{}")
  createdAt   DateTime @default(now())

  clinic   Clinic    @relation(fields: [clinicId], references: [id])
  patients Patient[]
  sessions Session[]
  alerts   Alert[]
}

model Patient {
  id            String        @id @default(cuid())
  clinicId      String
  caregiverId   String
  name          String
  dob           DateTime?
  diagnosis     String?
  status        PatientStatus @default(ACTIVE)
  contacts      Json          @default("[]")
  consentStatus ConsentStatus @default(PENDING)
  createdAt     DateTime      @default(now())

  clinic    Clinic     @relation(fields: [clinicId], references: [id])
  caregiver User       @relation(fields: [caregiverId], references: [id])
  sessions  Session[]
  notes     Note[]
  signals   Signal[]
  homework  Homework[]
  messages  Message[]
  alerts    Alert[]
  reports   Report[]
}

model Session {
  id            String        @id @default(cuid())
  patientId     String
  clinicId      String
  caregiverId   String
  scheduledAt   DateTime
  startedAt     DateTime?
  endedAt       DateTime?
  type          SessionType
  status        SessionStatus @default(SCHEDULED)
  recordingUrl  String?
  transcriptUrl String?
  createdAt     DateTime      @default(now())

  patient   Patient @relation(fields: [patientId], references: [id])
  caregiver User    @relation(fields: [caregiverId], references: [id])
  notes     Note[]
}

model Note {
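  id           String   @id @default(cuid())
  sessionId    String
  patientId    String
  clinicId     String
  templateId   String?
  body         Json     @default("{}")
  summaryDraft String?  // AI output awaiting caregiver review (see 7.7)
  summary      String?  // set only after explicit caregiver approval
  createdAt    DateTime @default(now())
  updatedAt    DateTime @updatedAt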

  session Session @relation(fields: [sessionId], references: [id])
  patient Patient @relation(fields: [patientId], references: [id])
}

model Signal {
  id          String     @id @default(cuid())
  patientId   String
  clinicId    String
  type        SignalType
  value       Float
  measuredAt  DateTime
  metadata    Json       @default("{}")

  patient Patient @relation(fields: [patientId], references: [id])
}

model Homework {
  id             String         @id @default(cuid())
  patientId      String
  clinicId       String
  description    String
  assignedAt     DateTime       @default(now())
  dueAt          DateTime?
  completedAt    DateTime?
  followUpRule   Json           @default("{}")
  status         HomeworkStatus @default(PENDING)

  patient Patient @relation(fields: [patientId], references: [id])
}

model Message {
  id        String          @id @default(cuid())
  patientId String
  clinicId  String
  direction MessageDirection
  channel   MessageChannel
  content   String
  metadata  Json            @default("{}")
  sentAt    DateTime        @default(now())
  readAt    DateTime?

  patient Patient @relation(fields: [patientId], references: [id])
}

model Alert {
  id              String      @id @default(cuid())
  clinicId        String
  caregiverId     String
  type            AlertType
  priority        Priority
  patientId       String?
  status          AlertStatus @default(OPEN)
  escalationStep  Int         @default(0)
  createdAt       DateTime    @default(now())
  acknowledgedAt  DateTime?

  caregiver User    @relation(fields: [caregiverId], references: [id])
  patient   Patient? @relation(fields: [patientId], references: [id])
}

model Report {
  id          String       @id @default(cuid())
  patientId   String
  clinicId    String
  type        ReportType
  generatedAt DateTime     @default(now())
  pdfUrl      String?
  sentTo      Json         @default("[]")
  status      ReportStatus @default(PENDING)

  patient Patient @relation(fields: [patientId], references: [id])
}

model AuditLog {
  id           String   @id @default(cuid())
  clinicId     String
  userId       String
  action       String
  resourceType String
  resourceId   String
  metadata     Json     @default("{}")
  timestamp    DateTime @default(now())

  @@index([clinicId, timestamp])
  @@index([resourceType, resourceId])
}

enum Role            { CAREGIVER SUPERVISOR ADMIN }
enum PatientStatus   { ACTIVE INACTIVE DISCHARGED DELETED }
enum ConsentStatus   { PENDING GIVEN WITHDRAWN }
enum SessionType     { INDIVIDUAL GROUP INTAKE CRISIS }
enum SessionStatus   { SCHEDULED IN_PROGRESS COMPLETED CANCELLED NO_SHOW }
enum SignalType      { PHQ9 GAD7 MOOD CUSTOM }
enum HomeworkStatus  { PENDING IN_PROGRESS COMPLETED OVERDUE CANCELLED }
enum MessageDirection { IN OUT }
enum MessageChannel  { EMAIL WHATSAPP TELEGRAM SMS VOICE }
enum AlertType       { CRISIS OVERDUE_HOMEWORK MISSED_SESSION SIGNAL_THRESHOLD }
enum Priority        { LOW MEDIUM HIGH CRITICAL }
enum AlertStatus     { OPEN ACKNOWLEDGED RESOLVED ESCALATED }
enum ReportType      { SESSION_SUMMARY PROGRESS DISCHARGE INTAKE }
enum ReportStatus    { PENDING GENERATING COMPLETE FAILED }

Index strategy:

Patient: index on (clinicId, caregiverId, status)
Session: index on (clinicId, caregiverId, scheduledAt)
Message: index on (clinicId, patientId, sentAt)
Alert: index on (clinicId, caregiverId, status, priority)
AuditLog: compound index on (clinicId, timestamp) for compliance reporting

7. Integrations

7.1 Calendar

Google Calendar

OAuth2 via googleapis Node.js SDK
Scopes: calendar.events, calendar.readonly
OAuth tokens stored encrypted in User.preferences (or a dedicated CalendarToken table)
Sync strategy:
1. Push (primary): Register a watch channel on the calendar. Google sends change notifications to POST /api/webhooks/google-calendar. On notification, fetch the changed event and upsert into the Session table.
2. Pull (fallback): A BullMQ job runs every 15 minutes to pull events from the past 7 days as a consistency check.
Appointment matching: incoming events are matched to existing patients by attendee email address first, then by fuzzy full-name match (using fuse.js). Unmatched events are surfaced to the caregiver for one-time confirmation.
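The two-stage appointment matching (email first, then fuzzy name) can be sketched as below. The plan names fuse.js for the fuzzy step; to stay dependency-free this example swaps in a simple bigram Dice coefficient instead, so the scores and the 0.8 threshold are illustrative assumptions, as are the type and function names and the use of attendee display names as the name source.

```typescript
interface PatientRef { id: string; name: string; emails: string[] }
interface CalendarEventLite { attendeeEmails: string[]; attendeeNames: string[] }
type MatchResult = { kind: "email" | "name"; patientId: string } | { kind: "unmatched" };

function bigrams(s: string): string[] {
  const t = s.toLowerCase().replace(/\s+/g, " ").trim();
  const out: string[] = [];
  for (let i = 0; i < t.length - 1; i++) out.push(t.slice(i, i + 2));
  return out;
}

// Dice coefficient over character bigrams: 1 means identical, 0 means no overlap.
function diceSimilarity(a: string, b: string): number {
  const x = bigrams(a);
  const y = bigrams(b);
  if (x.length === 0 || y.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const g of x) counts.set(g, (counts.get(g) ?? 0) + 1);
  let overlap = 0;
  for (const g of y) {
    const n = counts.get(g) ?? 0;
    if (n > 0) { overlap++; counts.set(g, n - 1); }
  }
  return (2 * overlap) / (x.length + y.length);
}

function matchEvent(event: CalendarEventLite, patients: PatientRef[], threshold = 0.8): MatchResult {
  // Stage 1: exact, case-insensitive email match.
  const emails = new Set(event.attendeeEmails.map((e) => e.toLowerCase()));
  const byEmail = patients.find((p) => p.emails.some((e) => emails.has(e.toLowerCase())));
  if (byEmail) return { kind: "email", patientId: byEmail.id };
  // Stage 2: best fuzzy name match at or above the threshold.
  let best: { id: string; score: number } | null = null;
  for (const p of patients) {
    for (const name of event.attendeeNames) {
      const score = diceSimilarity(name, p.name);
      if (score >= threshold && (best === null || score > best.score)) best = { id: p.id, score };
    }
  }
  // Stage 3: nothing matched, so the event is surfaced to the caregiver.
  return best ? { kind: "name", patientId: best.id } : { kind: "unmatched" };
}
```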

Microsoft Outlook / Exchange

Microsoft Graph API via @microsoft/microsoft-graph-client
OAuth2 via @azure/msal-node (MSAL)
Scopes: Calendars.Read, Calendars.ReadWrite
Change notifications via Microsoft Graph subscription (/subscriptions) with a webhook endpoint
Same pull fallback pattern as Google

7.2 Email

Primary provider: AWS SES with verified sending domain (mail.sherpii.io)
Fallback / transactional: SendGrid (for clinics on the pro plan who want marketing-style templates)
Template rendering: @react-email/components for type-safe HTML email templates compiled to static HTML
Use cases: appointment confirmation, homework assignment, report delivery (with PDF attachment), alert escalation to supervisor
PDF generation: Puppeteer (headless Chromium) running inside a Docker task on ECS Fargate. The report HTML is rendered server-side and printed to PDF, then uploaded to S3. A pre-signed URL (15-minute TTL) is included in the email.

7.3 WhatsApp

Provider: Twilio WhatsApp Business API (or Meta Cloud API directly at scale)
Inbound flow: POST /api/webhooks/twilio-whatsapp → verify Twilio signature → parse Body and From → look up Patient by phone number → create Message record → if within 24-hour service window, optionally trigger AI auto-reply draft for caregiver review
Outbound flow: Messages routed through message-queue in BullMQ. Rate limiting: max 10 messages per patient per hour. Template messages (pre-approved by Meta) used for all scheduled outbound touchpoints.
24-hour window management: A Redis key tracks the last inbound message timestamp per patient. Outbound free-form messages are only dispatched within the window; after expiry, a template message is sent to re-open the window.
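The outbound decision combines the 24-hour service window with the per-patient rate limit. A minimal sketch as a pure function, assuming timestamps in epoch milliseconds that would in practice come from Redis; decideOutbound and its return values are hypothetical names.

```typescript
const SERVICE_WINDOW_MS = 24 * 60 * 60 * 1000;
const RATE_WINDOW_MS = 60 * 60 * 1000;
const MAX_MESSAGES_PER_HOUR = 10;

type OutboundKind = "FREE_FORM" | "TEMPLATE" | "RATE_LIMITED";

function decideOutbound(
  lastInboundAt: number | null, // last inbound message from the patient, if any
  sentTimestamps: number[],     // recent outbound sends to the same patient
  now: number,
): OutboundKind {
  const sentLastHour = sentTimestamps.filter((t) => now - t < RATE_WINDOW_MS).length;
  if (sentLastHour >= MAX_MESSAGES_PER_HOUR) return "RATE_LIMITED";
  const windowOpen = lastInboundAt !== null && now - lastInboundAt < SERVICE_WINDOW_MS;
  // Outside the window only a pre-approved template message may be sent.
  return windowOpen ? "FREE_FORM" : "TEMPLATE";
}
```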

7.4 Telegram

Library: grammy (TypeScript-first Telegram Bot API framework)
Bot architecture: One bot per environment (dev/staging/prod). Clinic routing is handled via a deep-link start token: https://t.me/SherpiiBot?start={clinicToken}_{patientToken}. On /start, the bot resolves the patient and stores the chatId against the Patient record.
Inbound messages: Telegram Update → webhook (POST /api/webhooks/telegram) → route to patient → create Message record
Inline keyboards: Homework completion confirmation, appointment reminders, and mood check-ins use Telegram's InlineKeyboard for structured one-tap responses. Button callbacks update the relevant Homework or Signal record.
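The deep-link start token has to survive a round trip through Telegram, which limits the start parameter to 64 characters drawn from letters, digits, underscore, and hyphen. A sketch of building and parsing the payload follows, assuming the tokens themselves never contain an underscore so it can serve as the separator; the function names are hypothetical.

```typescript
const BOT_USERNAME = "SherpiiBot";
const TOKEN_PATTERN = /^[A-Za-z0-9-]+$/; // assumption: tokens contain no underscore
const MAX_START_PAYLOAD_LENGTH = 64;

function buildStartLink(clinicToken: string, patientToken: string): string {
  if (!TOKEN_PATTERN.test(clinicToken) || !TOKEN_PATTERN.test(patientToken)) {
    throw new Error("tokens may only contain letters, digits, and hyphens");
  }
  const payload = clinicToken + "_" + patientToken;
  if (payload.length > MAX_START_PAYLOAD_LENGTH) {
    throw new Error("start payload exceeds 64 characters");
  }
  return "https://t.me/" + BOT_USERNAME + "?start=" + payload;
}

// Parses the text that follows /start; returns null for anything malformed.
function parseStartPayload(
  payload: string,
): { clinicToken: string; patientToken: string } | null {
  const match = /^([A-Za-z0-9-]+)_([A-Za-z0-9-]+)$/.exec(payload);
  if (match === null) return null;
  const clinicToken = match[1];
  const patientToken = match[2];
  if (clinicToken === undefined || patientToken === undefined) return null;
  return { clinicToken, patientToken };
}
```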

7.5 Voice

Provider: Twilio Programmable Voice
Outbound calls: Initiated via Twilio REST API. TwiML <Say> and <Gather> verbs deliver pre-recorded or TTS reminders. Patient key-press responses (e.g. "press 1 to confirm") are captured and stored.
Escalation calls: When an Alert reaches escalationStep >= 2, a voice call is placed to the caregiver's registered mobile number using a TwiML Bin with the patient name and alert type.
Call recording: Optional, enabled per-clinic with explicit patient consent stored in Patient.consentStatus. Recordings stored in S3 with server-side encryption.

7.6 Transcription

Live streaming (in-session)

Browser MediaRecorder API captures audio chunks (16 kHz, mono, 250 ms intervals)
Chunks are sent over the WebSocket connection as binary ArrayBuffer
The API server pipes chunks to Amazon Transcribe Streaming via the @aws-sdk/client-transcribe-streaming SDK
Partial and full transcripts are pushed back over the WebSocket to the frontend for the live transcription panel

Post-session (async)

Full recording is uploaded to S3 (recordings/{clinicId}/{sessionId}.webm)
S3 ObjectCreated event triggers an SQS message
An SQS consumer in the worker process turns that message into a transcription-queue BullMQ job (BullMQ reads from Redis, not SQS); the job calls Amazon Transcribe (standard) or Whisper API
Completed transcript stored in Session.transcriptUrl (S3 JSON) and a plain-text version cached in the DB

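Language support: Hebrew, English, and Arabic via Amazon Transcribe Standard. Amazon Transcribe Medical supports US English only, so it is evaluated per clinic for English-language sessions; clinical terminology accuracy in Hebrew has to come from other means, such as custom vocabularies on the standard service, subject to evaluation. Whisper large-v3 is the fallback for languages or accents with lower Amazon accuracy.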

7.7 AI Summarisation

Models: Claude 3.5 Sonnet (primary, for quality) / Claude 3 Haiku (fast draft for lower-tier plans) via the Anthropic SDK. GPT-4o-mini as an optional alternative for clinics that require it contractually.

Prompt construction:

[System prompt — clinical tone, privacy constraints, output format]
[Clinic context — modality, treatment model, caregiver name]   ← cached
[Patient context — diagnosis, recent history, last session summary]
[Current session — structured notes JSON + transcript]
[Instruction — produce: Presenting Concern / Key Themes / Plan / Homework]

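Prompt caching: The Anthropic SDK cache_control parameter is applied to the system prompt + clinic context block. This is expected to reduce latency on the second and subsequent summaries for a given caregiver from ~4 s to ~1.5 s and to cut token costs by ~75% for the cached prefix. Cache entries are short-lived, so the saving applies mainly to summaries generated in quick succession.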
Human-in-the-loop: AI output is stored in a draft field on the Note record (summaryDraft in the Prisma naming convention). The caregiver sees a diff-style review UI. Note.summary is only set after explicit caregiver approval. AI output is never written to Note.summary automatically.
Audit trail: Every AI call logs the model, input token count, output token count, and latency to AuditLog with action: AI_SUMMARY_GENERATED.
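The prompt layout and the cache breakpoint could be assembled as a plain request payload before it is handed to the SDK. This sketch builds the object only and makes no network call. It assumes the Messages API accepts system as an array of text blocks with cache_control: { type: "ephemeral" } on the last cacheable block; the input field names, the max_tokens value, and the instruction wording are illustrative.

```typescript
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface SummaryRequest {
  model: string;
  max_tokens: number;
  system: TextBlock[];
  messages: Array<{ role: "user"; content: string }>;
}

interface SummaryInput {
  model: string;
  systemPrompt: string;
  clinicContext: string;
  patientContext: string;
  notesJson: string;
  transcript: string;
}

const SUMMARY_INSTRUCTION =
  "Produce a summary with the sections: Presenting Concern / Key Themes / Plan / Homework.";

function buildSummaryRequest(input: SummaryInput): SummaryRequest {
  return {
    model: input.model,
    max_tokens: 1024, // hypothetical output budget
    system: [
      { type: "text", text: input.systemPrompt },
      // The cache breakpoint covers everything up to and including this block.
      { type: "text", text: input.clinicContext, cache_control: { type: "ephemeral" } },
    ],
    messages: [
      {
        role: "user",
        // Per-patient and per-session content stays outside the cached prefix.
        content: [
          "Patient context:\n" + input.patientContext,
          "Session notes (JSON):\n" + input.notesJson,
          "Transcript:\n" + input.transcript,
          SUMMARY_INSTRUCTION,
        ].join("\n\n"),
      },
    ],
  };
}
```

Keeping the stable blocks first and the per-session content last is what lets the prefix be reused across summaries for the same clinic.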

8. Background Jobs (BullMQ + Redis)

All queues use BullMQ backed by ElastiCache Redis. The worker process is a separate ECS Fargate task from the API server, sharing the same Docker image but with a different CMD (node dist/worker.js).

Queue Definitions

Queue	Trigger	Job
`transcription-queue`	Session `endedAt` set	Fetch audio from S3 → call Transcribe → store transcript
`summary-queue`	Transcript available OR note saved manually	Fetch notes + transcript → call AI API → store draft → notify caregiver via WS
`followup-queue`	Homework `dueAt` passed	Check `completedAt` → send reminder if overdue → reschedule check based on `followUpRule`
`report-queue`	Report requested	Render HTML → Puppeteer PDF → upload to S3 → send email
`alert-escalation-queue`	Alert created or escalation step scheduled	Check `acknowledgedAt` → if no ack within SLA, fire next escalation step (in-app → email → voice call)
`message-queue`	Outbound message created	Rate-limit check → call channel API (Twilio/SES/Telegram) → update `Message.sentAt`
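The decision made by each alert-escalation-queue job can be sketched as a pure function. Step numbering follows the plan (0 = in-app, 1 = email, 2 = voice call); the lastEscalatedAt field and a single SLA value shared by all steps are assumptions, since the schema only records createdAt and acknowledgedAt.

```typescript
const ESCALATION_CHANNELS = ["IN_APP", "EMAIL", "VOICE_CALL"] as const;
type EscalationChannel = (typeof ESCALATION_CHANNELS)[number];

interface AlertSnapshot {
  escalationStep: number;        // 0 = in-app, 1 = email, 2 = voice call
  acknowledgedAt: number | null; // epoch ms, null while unacknowledged
  lastEscalatedAt: number;       // epoch ms; hypothetical field, not in the schema
}

type EscalationDecision =
  | { action: "STOP" }
  | { action: "WAIT"; retryInMs: number }
  | { action: "ESCALATE"; nextStep: number; channel: EscalationChannel };

function decideEscalation(alert: AlertSnapshot, now: number, slaMs: number): EscalationDecision {
  if (alert.acknowledgedAt !== null) return { action: "STOP" };
  const nextStep = alert.escalationStep + 1;
  const channel = ESCALATION_CHANNELS[nextStep];
  if (channel === undefined) return { action: "STOP" }; // already at the last step
  const elapsed = now - alert.lastEscalatedAt;
  // Not overdue yet: the job can be re-queued with this delay.
  if (elapsed < slaMs) return { action: "WAIT", retryInMs: slaMs - elapsed };
  return { action: "ESCALATE", nextStep, channel };
}
```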

Retry & Error Handling

import type { DefaultJobOptions } from 'bullmq';

const defaultJobOptions: DefaultJobOptions = {
  attempts: 3, // one initial try plus two retries
  backoff: {
    type: 'exponential',
    delay: 2000, // retries wait 2 s, then 4 s
  },
  removeOnComplete: { count: 100 },
  removeOnFail: false, // keep failed jobs for inspection
};

Failed jobs after max retries are moved to a dead-letter queue ({queue-name}:failed). A separate monitoring job scans dead-letter queues every 5 minutes and fires a Slack notification + CloudWatch alarm if the count exceeds a threshold.
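The retry schedule implied by attempts and an exponential backoff can be worked out with a few lines of arithmetic. This sketch assumes the delay before retry n is the base delay multiplied by 2^(n-1), which is how BullMQ's built-in exponential strategy is understood to behave; exponentialDelays is a hypothetical helper, not a BullMQ API.

```typescript
// Returns the wait before each retry; attempts counts the initial try as well.
function exponentialDelays(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < attempts; retry++) {
    delays.push(baseDelayMs * Math.pow(2, retry - 1));
  }
  return delays;
}

// With attempts: 3 and delay: 2000 there are two retries.
const retrySchedule = exponentialDelays(3, 2000);
const totalWaitMs = retrySchedule.reduce((sum, d) => sum + d, 0);
```

With these settings a job is given up on roughly six seconds of waiting after its first failure, plus the run time of the attempts themselves.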

9. Real-time

Server-side: socket.io v4 with the Redis adapter (@socket.io/redis-adapter) so that WebSocket events are broadcast correctly across multiple ECS API task instances.

Connection lifecycle:

Client connects on login, sends AUTH event with the access token
Server validates the JWT and joins the socket to rooms: clinic:{clinicId} and user:{userId}
Caregiver starting a session joins session:{sessionId} for transcription events

Events:

Event	Direction	Payload
`transcription:chunk`	Server → client	`{ sessionId, text, isFinal }`
`alert:new`	Server → client	`{ alertId, type, priority, patientName }`
`alert:updated`	Server → client	`{ alertId, status }`
`session:updated`	Server → client	`{ sessionId, updatedBy }`
`dashboard:stats`	Server → client	`{ activeAlerts, todaySessions }`
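The event table above can be captured as a single TypeScript map so that handlers and emitters agree on payload shapes. The tiny in-memory bus below is a stand-in used only to show the typing; with socket.io the map would need adapting to its listener-signature style. Field types that the table does not spell out (for example the string-literal unions taken from the schema enums) are assumptions.

```typescript
type Priority = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface ServerToClientEvents {
  "transcription:chunk": { sessionId: string; text: string; isFinal: boolean };
  "alert:new": { alertId: string; type: string; priority: Priority; patientName: string };
  "alert:updated": { alertId: string; status: string };
  "session:updated": { sessionId: string; updatedBy: string };
  "dashboard:stats": { activeAlerts: number; todaySessions: number };
}

type EventName = keyof ServerToClientEvents;

// Hypothetical in-memory bus; payload types are checked per event name.
function createEventBus() {
  const handlers = new Map<string, Array<(payload: unknown) => void>>();
  return {
    on<E extends EventName>(event: E, handler: (payload: ServerToClientEvents[E]) => void): void {
      const list = handlers.get(event) ?? [];
      list.push(handler as (payload: unknown) => void);
      handlers.set(event, list);
    },
    emit<E extends EventName>(event: E, payload: ServerToClientEvents[E]): void {
      for (const h of handlers.get(event) ?? []) h(payload);
    },
  };
}

// Example: only final transcript chunks are appended to the saved transcript.
const bus = createEventBus();
const finalChunks: string[] = [];
bus.on("transcription:chunk", (chunk) => {
  if (chunk.isFinal) finalChunks.push(chunk.text);
});
bus.emit("transcription:chunk", { sessionId: "s1", text: "partial", isFinal: false });
bus.emit("transcription:chunk", { sessionId: "s1", text: "Final sentence.", isFinal: true });
```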

SSE fallback: If WebSocket fails (corporate proxies, etc.), the client falls back to EventSource on GET /api/sse/alerts and GET /api/sse/transcription/{sessionId}. These are one-way feeds and cover the most critical real-time use cases.

10. Infrastructure (AWS)

Compute

ECS Fargate — two task definitions:
- sherpii-api: Fastify server, auto-scaled on CPU + request count (min 2, max 10 tasks)
- sherpii-worker: BullMQ worker, auto-scaled on Redis queue depth (custom CloudWatch metric)
Both tasks pull Docker images from ECR (Elastic Container Registry)

Data

RDS PostgreSQL 15 — Multi-AZ in production. db.t4g.medium for staging, db.r7g.large for production. Automated daily snapshots retained for 30 days.
ElastiCache Redis 7 — Single-node cache.t4g.medium for staging, cluster mode with 2 shards for production. Used by BullMQ and socket.io adapter.

Storage & Delivery

S3 — Separate buckets per environment:
- sherpii-recordings-{env} — audio files (encrypted, SSE-KMS)
- sherpii-reports-{env} — generated PDF reports (SSE-S3)
- sherpii-documents-{env} — patient intake forms, consent docs
CloudFront — CDN for:
- Frontend SPA (served from S3 bucket)
- Pre-signed S3 URLs for secure file delivery with short TTLs

Auth & Routing

Cognito User Pool — one per environment. Custom domain: auth.sherpii.io. App clients per environment. Hosted UI with clinic-branded CSS override.
Route 53 — Wildcard DNS record *.sherpii.io → CloudFront for clinic subdomains. The frontend resolves clinic context from the subdomain on boot.
Application Load Balancer — fronts ECS Fargate tasks. HTTPS termination at ALB.

Async / Eventing

SQS — Dead-letter queues for BullMQ failures and S3 event notifications (transcription trigger)
SNS — Routes CloudWatch alarms to PagerDuty and Slack

CI/CD (GitHub Actions)

pull_request:
  1. pnpm install
  2. biome lint + type-check (tsc --noEmit)
  3. vitest run (unit + integration with Docker Compose PostgreSQL)
  4. playwright (E2E, staging environment)

push to main:
  1-4 above
  5. docker buildx build → push to ECR (tagged with git SHA)
  6. terraform plan (PR comment) / terraform apply (main only)
  7. aws ecs update-service --force-new-deployment (rolling deployment with the ECS deployment circuit breaker for automatic rollback)

Environments:

dev — shared, always-on, seeded with demo data
staging — per-PR ephemeral environment spun up via Terraform workspace (destroyed 24 h after PR merge)
production — promoted from main on manual approval gate

11. Security & Compliance

HIPAA (US Patients)

Business Associate Agreements (BAA) in place with: AWS, Twilio, Anthropic (or OpenAI), SendGrid
PHI encryption at rest: RDS encryption enabled (AES-256 via KMS), S3 SSE-KMS for sensitive buckets
TLS 1.2+ enforced on all endpoints (ALB security policy ELBSecurityPolicy-TLS13-1-2-2021-06)
Audit logging: every PHI access (read, write, delete) emits an AuditLog record + a structured CloudTrail event
Minimum-necessary access: caregivers cannot query patients outside their caseload; the API enforces this at the service layer
Employee access: IAM roles with least-privilege; no developer has direct RDS access in production — changes via migrations only

GDPR (EU Patients)

Data residency: the eu-west-1 (Ireland) deployment option is available; Terraform workspace parameter AWS_REGION controls this
Right to erasure: DELETE /api/patients/{id} performs a soft-delete (sets status = DELETED, obfuscates PII fields). A nightly hard-delete pipeline permanently removes records older than the retention period (configurable per clinic)
Data processing agreements (DPAs) maintained with all sub-processors
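Consent: tracked per data category (recording, AI analysis, third-party sharing) with timestamps. The Patient.consentStatus enum shown in section 6 holds a single overall state, so the per-category records need their own storage (for example a dedicated consent table or a structured JSON column)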
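The two-stage erasure described above (soft-delete with obfuscation, then a nightly hard delete) might look like the following sketch. The PatientRecord shape is a simplified copy of the Prisma model; softDeletedAt, the placeholder name format, and both function names are assumptions.

```typescript
interface PatientRecord {
  id: string;
  clinicId: string;
  name: string;
  dob: string | null;
  diagnosis: string | null;
  contacts: unknown[];
  status: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Soft delete: keep the row and its keys, but blank every PII field.
function obfuscatePatient(p: PatientRecord): PatientRecord {
  return {
    ...p,
    name: "deleted-" + p.id, // placeholder that reveals nothing about the person
    dob: null,
    diagnosis: null,
    contacts: [],
    status: "DELETED",
  };
}

// Nightly job predicate: has the clinic's retention period elapsed?
// softDeletedAt is a hypothetical timestamp recorded at soft-delete time.
function isDueForHardDelete(softDeletedAt: number, now: number, retentionDays: number): boolean {
  return now - softDeletedAt >= retentionDays * DAY_MS;
}
```

Related rows (notes, messages, recordings in S3) would need the same treatment; that cascade is outside this sketch.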

Application Security

Threat	Mitigation
SQL injection	Prisma parameterised queries (no raw SQL except reviewed migrations)
XSS	React's default escaping; strict CSP headers (`script-src 'self'`)
CSRF	`SameSite=Strict` cookies; `Origin` header validation on state-changing requests
Rate limiting	Per-IP: 100 req/min; Per-user: 1000 req/min; per-endpoint overrides for auth routes (10 req/min) via `@fastify/rate-limit`
Secrets leakage	All secrets in AWS Secrets Manager; injected as env vars at ECS task definition time; no `.env` files in the repo
Dependency vulnerabilities	Dependabot weekly PRs + `pnpm audit` in CI as a blocking check
Auth bypass	JWT audience + issuer validated; short access token TTL (15 min); refresh token rotation

A third-party penetration test is scheduled in the final week before the Phase A launch.

12. Observability

Logging

pino (Fastify's built-in logger) outputs structured JSON logs to stdout
ECS sends stdout to CloudWatch Logs (/sherpii/api/{env}, /sherpii/worker/{env})
Log fields: requestId, clinicId, userId, route, statusCode, responseTimeMs, error (if applicable)
CloudWatch Insights saved queries for: error rate by route, slow requests (p99 > 1 s), AI call latency distribution

Tracing

AWS X-Ray — aws-xray-sdk-node wraps the Fastify HTTP server, Prisma client, and external HTTP calls
Traces are sampled at 5% in production, 100% in staging
Useful for diagnosing slow AI summary calls and transcription pipeline latency

Metrics (Custom CloudWatch)

Metric	Alarm Threshold
`API/ErrorRate5xx`	> 1% over 5 min → PagerDuty P2
`Worker/QueueDepth[transcription]`	> 50 jobs → Slack warning
`Worker/QueueDepth[alert-escalation]`	> 10 jobs → PagerDuty P1
`AI/SummaryLatencyP99`	> 10 s → Slack warning
`DB/ConnectionCount`	> 80% of max → PagerDuty P2

Frontend Error Tracking

Sentry (browser SDK) — captures unhandled errors and React Error Boundary events
Source maps uploaded to Sentry in the CI build step (maps deleted from the public S3 bundle)
Alerts for new error types sent to the #eng-alerts Slack channel

13. Testing Strategy

Unit Tests (Vitest)

Location: colocated with source files as *.test.ts / *.test.tsx
Coverage: pure functions (date transformations, signal calculators, Zod schema validators, prompt builders)
Frontend component unit tests: @testing-library/react + Vitest + msw (Mock Service Worker) for API mocking
Run on every commit in CI

Integration Tests (Supertest + Fastify test harness)

Spin up the full Fastify app in test mode with a real PostgreSQL database (Docker Compose in CI)
Test each route: happy path + auth failure + validation error + clinicId isolation
Use Prisma's $transaction rollback pattern to reset state between tests (fast)
Run on every PR

E2E Tests (Playwright)

Critical paths only (to keep CI fast):
1. Login flow (Cognito Hosted UI → app)
2. Create patient → schedule session → open session
3. Save note → trigger AI summary → approve summary
4. Generate report → verify PDF download
5. Trigger crisis alert → acknowledge alert
Run against the staging environment on every pull request and again on merge to main, matching the CI pipeline in section 10
Screenshots and video on failure uploaded to S3

Contract Tests

The OpenAPI 3.1 spec (generated from Fastify schemas) is committed to the repo
A Vitest test suite validates that the spec matches actual Fastify route schemas (@fastify/swagger strict mode)
The frontend uses the generated TypeScript client (via openapi-typescript) — type errors in CI catch schema drift

Visual Regression (Storybook + Chromatic)

packages/ui has a Storybook with stories for every component
Chromatic runs on every PR and blocks merge if visual diffs are detected
Baseline snapshots are updated manually after approved UI changes

Coverage Targets

Area	Target
Domain logic (services, validators, prompt builders)	80% line coverage
API route handlers	70% (via integration tests)
UI components	60% (Storybook stories + unit tests)
E2E critical paths	100% of the 5 defined paths green

14. Phase A vs Phase B Scope

Phase A — Launch (Months 1–4)

All core platform screens and flows:

Auth / Login — Cognito PKCE flow, clinic subdomain resolution, role-based redirect
Dashboard — Today's sessions, active alerts, quick stats, real-time updates
Calendar — Week/day view, Google + Outlook sync, appointment creation, drag-and-drop rescheduling
Sessions — Start/end session, live transcription panel, note editor (structured template + free text), AI summary review
Patients — Patient list, profile, history timeline, signal charts (PHQ-9, GAD-7, mood)
Communications — Unified inbox (WhatsApp, Telegram, email), outbound message composer, automated homework reminders
Alerts — Alert feed, acknowledgement flow, escalation step tracking, crisis flagging
Reports — Generate session summary, progress, and discharge reports; PDF delivery via email

Phase A additional modules:

Intake forms (custom form builder, patient self-completion link)
Digital consent management (per data category, timestamped)
Crisis flagging (auto-detect trigger phrases in transcripts + manual flag)
Standardised assessments (PHQ-9, GAD-7, custom questionnaire engine)
Goal tracking (create goals, link to sessions, mark progress)
Audit log viewer (admin-only, filterable by user, action, resource)

Phase B — 6–12 Month Roadmap

Billing & invoicing — Stripe integration, invoice generation per session, subscription management for clinic plans
Patient mobile PWA — A separate installable PWA for patients: view homework, complete check-ins, exchange messages with their caregiver
Supervision portal — Supervisor dashboard: caregiver caseload overview, session note review, countersignature workflow
Waitlist management — Waitlist queue, automated slot-opening notifications, referral tracking
Clinic analytics dashboard — Aggregate metrics: session volume, outcome trends, caregiver utilisation, alert resolution times
Public REST API & webhooks — Documented external API for EHR integrations; webhook subscriptions for events (session completed, alert fired)

15. Estimated Team

Role	Count	Notes
Senior full-stack engineer (React + Node)	2	Own features end-to-end; one takes tech-lead role
DevOps / infrastructure engineer	1	AWS, Terraform, CI/CD, security hardening
Product designer	1	Figma designs handed off to `packages/ui` as shadcn/ui components
QA engineer (part-time)	1	E2E test authoring in Playwright, exploratory testing, release sign-off

Recommended build timeline: approximately 4 months to Phase A MVP with this team composition, assuming:

Designs are available by the end of week 2
Third-party API credentials (Twilio, Cognito, Google/Microsoft OAuth apps) are provisioned by the end of week 1
No major scope changes after week 6

Milestones:

Week	Milestone
1–2	Monorepo scaffold, infra bootstrap (ECS, RDS, Cognito), auth flow end-to-end
3–5	Core data models (Prisma), patients CRUD, sessions CRUD, calendar sync (Google)
6–8	Note editor, live transcription (WebSocket), AI summary flow
9–11	Communications (WhatsApp, Telegram, email), alert engine, escalation queues
12–14	Reports (PDF), assessments, intake forms, consent, Phase A audit + security hardening
15–16	E2E testing, pen test remediation, performance tuning, production launch