Technical Plan · Confidential

Sherpii Platform

Architecture, stack decisions, and engineering guidelines.


Sherpii Platform — Technical Plan


1. Overview

Sherpii is a multi-tenant, multi-caregiver clinic management platform designed to support mental health and allied health clinics of varying sizes. Each clinic operates as an isolated tenant with its own subdomain, user roles, patient data, and configuration. The platform consolidates session management, clinical note-taking, AI-assisted summarisation, patient communications, homework tracking, and escalation alerting into a single cohesive product.

The core tech stack is React 18 / TypeScript with shadcn/ui on the front end, a Node.js 20 / TypeScript API server on the back end, PostgreSQL as the primary relational database managed via Prisma ORM, and AWS Cognito for identity and authentication. All infrastructure runs on AWS and is provisioned as code using Terraform or AWS CDK. The monorepo is managed with pnpm workspaces.


2. Repository & Monorepo Structure

The project is a pnpm workspace monorepo. All packages share a root-level pnpm-workspace.yaml and a root package.json for scripts and shared dev dependencies.

sherpii/
├── pnpm-workspace.yaml
├── package.json                  # root scripts, shared devDeps
├── biome.json                    # or .eslintrc.js at root
├── apps/
│   ├── web/                      # React/Vite frontend PWA
│   │   ├── src/
│   │   ├── public/
│   │   ├── vite.config.ts
│   │   └── package.json
│   └── api/                      # Fastify API server (Node.js, TypeScript)
│       ├── src/
│       │   ├── routes/
│       │   ├── services/
│       │   ├── jobs/
│       │   ├── plugins/
│       │   └── server.ts
│       └── package.json
├── packages/
│   ├── ui/                       # Shared shadcn/ui component library
│   │   ├── src/components/
│   │   ├── src/index.ts
│   │   └── package.json
│   ├── types/                    # Shared TypeScript types + Zod schemas
│   │   ├── src/
│   │   └── package.json
│   ├── db/                       # Prisma client + migrations
│   │   ├── prisma/
│   │   │   ├── schema.prisma
│   │   │   └── migrations/
│   │   ├── src/
│   │   └── package.json
│   └── config/                   # Shared configs: eslint, tsconfig, tailwind
│       ├── eslint/
│       ├── tsconfig/
│       │   ├── base.json
│       │   ├── react.json
│       │   └── node.json
│       └── tailwind/
└── infra/                        # Terraform modules (or AWS CDK)
    ├── modules/
    │   ├── ecs/
    │   ├── rds/
    │   ├── cognito/
    │   ├── redis/
    │   └── s3/
    ├── environments/
    │   ├── dev/
    │   ├── staging/
    │   └── production/
    └── main.tf

Key workspace conventions:


3. Frontend Stack

Core Libraries

Concern           Library                          Version
Framework         React                            18.x
Language          TypeScript                       5.x
Build tool        Vite                             5.x
Component system  shadcn/ui (Radix UI + Tailwind)  latest
Styling           Tailwind CSS                     v3.x
Routing           React Router                     v6.x
Server state      TanStack Query (React Query)     v5.x
Client state      Zustand                          v4.x
Forms             React Hook Form + Zod            latest
Date handling     date-fns                         v3.x
Animations        Framer Motion                    v11.x
Icons             Lucide React                     latest
PWA               vite-plugin-pwa (Workbox)        latest

Architecture Decisions

Route-based code splitting. All top-level routes are lazy-loaded via React.lazy + Suspense. Each feature area (dashboard, calendar, sessions, patients, comms, reports) is a separate chunk. This keeps the initial bundle under 200 KB gzipped.

// apps/web/src/router.tsx
import React from 'react';

// Each feature module must default-export its root component for React.lazy.
const Dashboard = React.lazy(() => import('./features/dashboard'));
const Calendar = React.lazy(() => import('./features/calendar'));
const Sessions = React.lazy(() => import('./features/sessions'));

TanStack Query as the data layer. All server data (patients, sessions, notes) lives in the React Query cache. Mutations use onMutate + onError rollback for optimistic UI updates — especially for note saving and calendar event drag-and-drop.
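The snapshot-and-rollback idea behind onMutate + onError can be illustrated without the library. This is a minimal, framework-free sketch: `NoteCache`, `applyOptimistic`, and the note shape are hypothetical names for illustration and are not TanStack Query APIs.

```typescript
// Hypothetical note shape and cache, for illustration only.
type Note = { id: string; body: string };
type NoteCache = Map<string, Note>;

// Applies an optimistic change and returns a rollback function that
// restores the previous cache entry (or removes it if none existed).
function applyOptimistic(cache: NoteCache, next: Note): () => void {
  const previous = cache.get(next.id); // snapshot, as onMutate would take
  cache.set(next.id, next);
  return () => {
    if (previous) cache.set(next.id, previous);
    else cache.delete(next.id);
  };
}

const cache: NoteCache = new Map([["n1", { id: "n1", body: "draft" }]]);
const rollback = applyOptimistic(cache, { id: "n1", body: "edited" });
const afterOptimistic = cache.get("n1")?.body; // value shown while the save is in flight
rollback(); // what onError would call if the server rejected the save
const afterRollback = cache.get("n1")?.body; // original value restored
```

In the real mutation, the snapshot would be returned from onMutate as context and the rollback would run inside onError.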

Zustand for client-only state. Used for: active session state (recording in progress, transcription buffer), sidebar collapse state, notification panel open/close, and ephemeral form drafts. Zustand stores are colocated with their feature folder and are not persisted to localStorage except for non-sensitive UI preferences.

Real-time. A singleton WebSocket connection is established on login (socket.io-client or native WebSocket). The connection subscribes to clinic:{clinicId} and user:{userId} channels. The events emitted from the server are listed in Section 9 (Real-time).

A Server-Sent Events fallback (EventSource) is used for one-way feeds (alerts, transcription) if the WebSocket connection fails.

Feature flags. On app boot, a lightweight GET /api/config endpoint returns a features map keyed by clinic ID. This controls progressive rollout of beta features (e.g. AI summarisation, WhatsApp integration). The flag map is stored in Zustand and consumed by a useFeatureFlag(key) hook.
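A flag lookup along these lines could sit behind useFeatureFlag(key). This sketch is framework-free and assumes the response for the current clinic has already been resolved to a flat map of flag name to boolean; `AppConfig`, `createFlagLookup`, and the flag names are hypothetical.

```typescript
// Hypothetical shape of the GET /api/config response for the current clinic.
type AppConfig = { features: Record<string, boolean> };

// Returns a lookup function; unknown flags default to "off" so that a
// missing key never enables a beta feature by accident.
function createFlagLookup(config: AppConfig): (key: string) => boolean {
  return (key) => config.features[key] === true;
}

const isEnabled = createFlagLookup({
  features: { aiSummarisation: true, whatsappIntegration: false },
});

const showSummaries = isEnabled("aiSummarisation");
const showWhatsApp = isEnabled("whatsappIntegration");
const showUnknown = isEnabled("notYetDefined");
```

Defaulting unknown keys to off means a typo in a flag name hides a feature rather than exposing it.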

PWA. The vite-plugin-pwa plugin generates a service worker using Workbox. Cache strategy:


4. Backend Stack

Core Libraries

Concern          Library
Runtime          Node.js 20 LTS
Language         TypeScript 5.x
HTTP framework   Fastify v4
ORM              Prisma v5
Database         PostgreSQL 15
Validation       Zod
Job queues       BullMQ + ioredis
Auth middleware  @fastify/jwt + Cognito JWKS
API docs         @fastify/swagger + @fastify/swagger-ui
AI               Anthropic SDK / OpenAI SDK
HTTP client      undici (the client behind the built-in fetch in Node 18+)
Logging          pino (Fastify default)

API Design Principles

RESTful JSON with OpenAPI 3.1. All routes are documented via Fastify's schema system. The OpenAPI spec is auto-generated at GET /api/docs/openapi.json and served via Swagger UI in non-production environments.

Consistent error format:

{
  "code": "PATIENT_NOT_FOUND",
  "message": "No patient found with the given ID",
  "details": { "patientId": "abc-123" }
}
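One way to keep every route on this format is a single typed builder. The sketch below is an illustration only: `ApiErrorBody`, `errorBody`, and the status mapping are hypothetical names, and the 500 default is an assumption.

```typescript
// Mirrors the error body shown above; `details` is optional context.
interface ApiErrorBody {
  code: string;
  message: string;
  details?: Record<string, unknown>;
}

function errorBody(
  code: string,
  message: string,
  details?: Record<string, unknown>,
): ApiErrorBody {
  return details ? { code, message, details } : { code, message };
}

// Hypothetical mapping from error code to HTTP status.
const STATUS_BY_CODE: Record<string, number> = { PATIENT_NOT_FOUND: 404 };
const statusFor = (code: string): number => STATUS_BY_CODE[code] ?? 500;

const body = errorBody(
  "PATIENT_NOT_FOUND",
  "No patient found with the given ID",
  { patientId: "abc-123" },
);
const status = statusFor(body.code);
```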

Cursor-based pagination for all list endpoints:

{
  "data": [...],
  "pagination": {
    "nextCursor": "eyJpZCI6IjEyMyJ9",
    "hasMore": true,
    "limit": 20
  }
}
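The envelope above can be produced from a "limit + 1" query result. This sketch assumes the cursor is opaque base64-encoded JSON holding the id of the last returned row (which is consistent with the example value above); `encodeCursor`, `decodeCursor`, and `toPage` are hypothetical helper names.

```typescript
type Page<T> = {
  data: T[];
  pagination: { nextCursor: string | null; hasMore: boolean; limit: number };
};

// Cursors are opaque to clients: base64-encoded JSON with the row id.
const encodeCursor = (id: string): string => btoa(JSON.stringify({ id }));
const decodeCursor = (cursor: string): string =>
  (JSON.parse(atob(cursor)) as { id: string }).id;

// `rows` is expected to hold up to limit + 1 items: the extra row only
// signals that another page exists and is not returned to the client.
function toPage<T extends { id: string }>(rows: T[], limit: number): Page<T> {
  const hasMore = rows.length > limit;
  const data = hasMore ? rows.slice(0, limit) : rows;
  const last = data[data.length - 1];
  return {
    data,
    pagination: {
      nextCursor: hasMore && last ? encodeCursor(last.id) : null,
      hasMore,
      limit,
    },
  };
}

const page = toPage([{ id: "122" }, { id: "123" }, { id: "124" }], 2);
const resumeFromId = page.pagination.nextCursor
  ? decodeCursor(page.pagination.nextCursor)
  : null;
```

Because the cursor here points at the last row already returned, the follow-up Prisma query would pass cursor: { id } together with skip: 1 so that row is not repeated.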

Multi-tenancy enforcement. Every Fastify route handler receives a clinicId extracted from the verified JWT. All Prisma queries include where: { clinicId } as a hard constraint. A custom Fastify plugin (clinicScope.plugin.ts) injects this at the request lifecycle level — it is not left to individual route handlers to remember.

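// Example scoped query
// Assumes `prisma` is the shared PrismaClient instance exported by packages/db.
async function getPatients(clinicId: string, cursor?: string) {
  return prisma.patient.findMany({
    where: { clinicId, status: 'ACTIVE' }, // clinicId is always part of the filter
    // Prisma includes the cursor row itself; add `skip: 1` if the cursor is
    // the last row of the previous page rather than the first row of the next.
    cursor: cursor ? { id: cursor } : undefined,
    take: 21, // page size 20, plus one extra row to compute hasMore
    // id breaks ties between rows that share the same createdAt
    orderBy: [{ createdAt: 'desc' }, { id: 'desc' }],
  });
}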

5. Authentication & Authorisation

Flow

  1. User navigates to {clinic}.sherpii.io
  2. Frontend resolves the clinicId from the subdomain via GET /api/clinics/resolve?subdomain={subdomain}
  3. Frontend generates a PKCE code_verifier and redirects to the Cognito Hosted UI with the derived code_challenge
  4. After successful login, Cognito redirects back with an authorisation code
  5. The code and the code_verifier are exchanged for tokens (/oauth2/token), which yields:
    • access_token (JWT, stored in memory only)
    • id_token (JWT, stored in memory only)
    • refresh_token (stored in an httpOnly, Secure, SameSite=Strict cookie; browser JavaScript cannot write httpOnly cookies, so this cookie has to be set by an API response)
  6. Every API request includes Authorization: Bearer {access_token}
  7. The API validates the JWT signature against Cognito's JWKS endpoint (/.well-known/jwks.json) using @fastify/jwt. The JWKS is cached in memory with a 12-hour TTL.
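The PKCE values used in the flow above follow RFC 7636: the code_challenge is the base64url-encoded SHA-256 hash of the code_verifier. The sketch below uses Node's crypto module to stay short and synchronous; in the browser the same hash would be computed with the Web Crypto API. The function names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// RFC 7636: the verifier is a high-entropy random string (43 to 128 chars).
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url"); // 43 URL-safe characters
}

// S256 method: code_challenge = BASE64URL(SHA-256(ASCII(code_verifier))).
function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Test vector from RFC 7636, Appendix B.
const rfcVerifier = "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk";
const rfcChallenge = createCodeChallenge(rfcVerifier);

// A fresh pair as the frontend would generate before redirecting.
const verifier = createCodeVerifier();
const challenge = createCodeChallenge(verifier);
```

The challenge travels with the redirect in step 3, while the verifier is kept client-side and only sent with the token exchange in step 5.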

No session state on the server. The API is fully stateless. The refresh_token cookie is handled by a dedicated POST /api/auth/refresh endpoint which exchanges it with Cognito and returns a new access token.

Roles & Permissions

Role        Description
CAREGIVER   Sees own patients and sessions only
SUPERVISOR  Sees all patients within the clinic; can review caregiver notes
ADMIN       Full clinic access; manages users, settings, integrations

Roles are stored as Cognito group memberships and mirrored in the User.role column in PostgreSQL for fast query-time access checks. A requireRole(roles: Role[]) Fastify hook enforces role access per route group.
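A role guard of this kind might look as follows. This is a sketch with minimal stand-in types rather than Fastify's real request and reply objects; `AuthedRequest`, `Reply`, and the FORBIDDEN error code are assumptions for illustration.

```typescript
type Role = "CAREGIVER" | "SUPERVISOR" | "ADMIN";

// Minimal stand-ins; the real hook would receive Fastify's request and reply.
type AuthedRequest = { user: { id: string; clinicId: string; role: Role } };
type Reply = { statusCode: number; body?: unknown };

// Returns a guard that rejects requests whose role is not in `roles`.
function requireRole(roles: Role[]) {
  return (request: AuthedRequest, reply: Reply): boolean => {
    if (roles.includes(request.user.role)) return true;
    reply.statusCode = 403;
    reply.body = { code: "FORBIDDEN", message: "Insufficient role" };
    return false;
  };
}

const supervisorsAndAdmins = requireRole(["SUPERVISOR", "ADMIN"]);

const caregiverReply: Reply = { statusCode: 200 };
const caregiverAllowed = supervisorsAndAdmins(
  { user: { id: "u1", clinicId: "c1", role: "CAREGIVER" } },
  caregiverReply,
);

const adminReply: Reply = { statusCode: 200 };
const adminAllowed = supervisorsAndAdmins(
  { user: { id: "u2", clinicId: "c1", role: "ADMIN" } },
  adminReply,
);
```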

Row-Level Security

Beyond role checks, all queries are scoped:

This is enforced in a service layer, not just in route handlers, to prevent accidental data leaks via internal service calls.
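Based on the role descriptions above, a service-layer scope builder could be sketched like this. `CurrentUser`, `PatientScope`, and `patientScope` are hypothetical names; the rule that only caregivers are restricted to their own patients is read from the roles table.

```typescript
type Role = "CAREGIVER" | "SUPERVISOR" | "ADMIN";
type CurrentUser = { id: string; clinicId: string; role: Role };
type PatientScope = { clinicId: string; caregiverId?: string };

// Every scope carries clinicId (tenant isolation); caregivers are
// additionally limited to their own patients.
function patientScope(user: CurrentUser): PatientScope {
  if (user.role === "CAREGIVER") {
    return { clinicId: user.clinicId, caregiverId: user.id };
  }
  return { clinicId: user.clinicId };
}

const caregiverScope = patientScope({ id: "u1", clinicId: "c1", role: "CAREGIVER" });
const supervisorScope = patientScope({ id: "u2", clinicId: "c1", role: "SUPERVISOR" });
```

A service function would then spread the result into its Prisma filter, for example where: { ...patientScope(user), status: 'ACTIVE' }, so the scope cannot be skipped by an individual caller.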


6. Database Schema (Prisma)

model Clinic {
  id        String   @id @default(cuid())
  name      String
  subdomain String   @unique
  plan      String   @default("starter")
  settings  Json     @default("{}")
  createdAt DateTime @default(now())

  users    User[]
  patients Patient[]
}

model User {
  id          String   @id @default(cuid())
  clinicId    String
  cognitoSub  String   @unique
  role        Role
  name        String
  email       String
  preferences Json     @default("{}")
  createdAt   DateTime @default(now())

  clinic   Clinic    @relation(fields: [clinicId], references: [id])
  patients Patient[]
  sessions Session[]
  alerts   Alert[]
}

model Patient {
  id            String        @id @default(cuid())
  clinicId      String
  caregiverId   String
  name          String
  dob           DateTime?
  diagnosis     String?
  status        PatientStatus @default(ACTIVE)
  contacts      Json          @default("[]")
  consentStatus ConsentStatus @default(PENDING)
  createdAt     DateTime      @default(now())

  clinic    Clinic     @relation(fields: [clinicId], references: [id])
  caregiver User       @relation(fields: [caregiverId], references: [id])
  sessions  Session[]
  notes     Note[]
  signals   Signal[]
  homework  Homework[]
  messages  Message[]
  alerts    Alert[]
  reports   Report[]
}

model Session {
  id           String        @id @default(cuid())
  patientId    String
  clinicId     String
  caregiverId  String
  scheduledAt  DateTime
  startedAt    DateTime?
  endedAt      DateTime?
  type         SessionType
  status       SessionStatus @default(SCHEDULED)
  recordingUrl String?
  transcriptUrl String?
  createdAt    DateTime      @default(now())

  patient   Patient @relation(fields: [patientId], references: [id])
  caregiver User    @relation(fields: [caregiverId], references: [id])
  notes     Note[]
}

model Note {
  id         String   @id @default(cuid())
  sessionId  String
  patientId  String
  clinicId   String
  templateId String?
  body       Json     @default("{}")
  summary    String?
  createdAt  DateTime @default(now())
  updatedAt  DateTime @updatedAt

  session Session @relation(fields: [sessionId], references: [id])
  patient Patient @relation(fields: [patientId], references: [id])
}

model Signal {
  id          String     @id @default(cuid())
  patientId   String
  clinicId    String
  type        SignalType
  value       Float
  measuredAt  DateTime
  metadata    Json       @default("{}")

  patient Patient @relation(fields: [patientId], references: [id])
}

model Homework {
  id             String         @id @default(cuid())
  patientId      String
  clinicId       String
  description    String
  assignedAt     DateTime       @default(now())
  dueAt          DateTime?
  completedAt    DateTime?
  followUpRule   Json           @default("{}")
  status         HomeworkStatus @default(PENDING)

  patient Patient @relation(fields: [patientId], references: [id])
}

model Message {
  id        String          @id @default(cuid())
  patientId String
  clinicId  String
  direction MessageDirection
  channel   MessageChannel
  content   String
  metadata  Json            @default("{}")
  sentAt    DateTime        @default(now())
  readAt    DateTime?

  patient Patient @relation(fields: [patientId], references: [id])
}

model Alert {
  id              String      @id @default(cuid())
  clinicId        String
  caregiverId     String
  type            AlertType
  priority        Priority
  patientId       String?
  status          AlertStatus @default(OPEN)
  escalationStep  Int         @default(0)
  createdAt       DateTime    @default(now())
  acknowledgedAt  DateTime?

  caregiver User    @relation(fields: [caregiverId], references: [id])
  patient   Patient? @relation(fields: [patientId], references: [id])
}

model Report {
  id          String       @id @default(cuid())
  patientId   String
  clinicId    String
  type        ReportType
  generatedAt DateTime     @default(now())
  pdfUrl      String?
  sentTo      Json         @default("[]")
  status      ReportStatus @default(PENDING)

  patient Patient @relation(fields: [patientId], references: [id])
}

model AuditLog {
  id           String   @id @default(cuid())
  clinicId     String
  userId       String
  action       String
  resourceType String
  resourceId   String
  metadata     Json     @default("{}")
  timestamp    DateTime @default(now())

  @@index([clinicId, timestamp])
  @@index([resourceType, resourceId])
}

enum Role {
  CAREGIVER
  SUPERVISOR
  ADMIN
}

enum PatientStatus {
  ACTIVE
  INACTIVE
  DISCHARGED
}

enum ConsentStatus {
  PENDING
  GIVEN
  WITHDRAWN
}

enum SessionType {
  INDIVIDUAL
  GROUP
  INTAKE
  CRISIS
}

enum SessionStatus {
  SCHEDULED
  IN_PROGRESS
  COMPLETED
  CANCELLED
  NO_SHOW
}

enum SignalType {
  PHQ9
  GAD7
  MOOD
  CUSTOM
}

enum HomeworkStatus {
  PENDING
  IN_PROGRESS
  COMPLETED
  OVERDUE
  CANCELLED
}

enum MessageDirection {
  IN
  OUT
}

enum MessageChannel {
  EMAIL
  WHATSAPP
  TELEGRAM
  SMS
  VOICE
}

enum AlertType {
  CRISIS
  OVERDUE_HOMEWORK
  MISSED_SESSION
  SIGNAL_THRESHOLD
}

enum Priority {
  LOW
  MEDIUM
  HIGH
  CRITICAL
}

enum AlertStatus {
  OPEN
  ACKNOWLEDGED
  RESOLVED
  ESCALATED
}

enum ReportType {
  SESSION_SUMMARY
  PROGRESS
  DISCHARGE
  INTAKE
}

enum ReportStatus {
  PENDING
  GENERATING
  COMPLETE
  FAILED
}

Index strategy:


7. Integrations

7.1 Calendar

Google Calendar

Microsoft Outlook / Exchange

7.2 Email

7.3 WhatsApp

7.4 Telegram

7.5 Voice

7.6 Transcription

Live streaming (in-session)

  1. Browser MediaRecorder API captures audio chunks (16 kHz, mono, 250 ms intervals)
  2. Chunks are sent over the WebSocket connection as binary ArrayBuffer
  3. The API server pipes chunks to Amazon Transcribe Streaming via the @aws-sdk/client-transcribe-streaming SDK
  4. Partial and full transcripts are pushed back over the WebSocket to the frontend for the live transcription panel
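For sizing the WebSocket frames in the steps above, the chunk size can be worked out directly. The arithmetic below assumes raw 16-bit PCM, which the plan does not state; MediaRecorder typically emits a compressed container, so real chunk sizes would differ unless the audio is converted first. `pcmChunkBytes` is a hypothetical helper.

```typescript
// Bytes in one chunk of uncompressed PCM audio.
function pcmChunkBytes(
  sampleRateHz: number,
  channels: number,
  bytesPerSample: number,
  chunkMs: number,
): number {
  return (sampleRateHz * channels * bytesPerSample * chunkMs) / 1000;
}

// 16 kHz, mono, 16-bit samples (assumed), 250 ms per chunk.
const bytesPerChunk = pcmChunkBytes(16_000, 1, 2, 250);
const chunksPerSecond = 1000 / 250;
const bytesPerSecond = bytesPerChunk * chunksPerSecond;
```

Under these assumptions each chunk is 8,000 bytes and a session streams 32,000 bytes per second, roughly 115 MB per hour of audio.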

Post-session (async)

  1. Full recording is uploaded to S3 (recordings/{clinicId}/{sessionId}.webm)
  2. S3 ObjectCreated event triggers an SQS message
  3. An SQS consumer enqueues a job on the BullMQ transcription-queue (BullMQ reads from Redis, not from SQS); the worker picks up the job and calls Amazon Transcribe (standard) or the Whisper API
  4. Completed transcript stored in Session.transcriptUrl (S3 JSON) and a plain-text version cached in the DB

Language support: Hebrew, English, and Arabic via Amazon Transcribe Standard. Amazon Transcribe Medical is evaluated per clinic for clinical terminology accuracy; note that it supports US English only, so it cannot cover Hebrew sessions. Whisper large-v3 is the fallback for languages or accents with lower Amazon accuracy.

7.7 AI Summarisation


8. Background Jobs (BullMQ + Redis)

All queues use BullMQ backed by ElastiCache Redis. The worker process is a separate ECS Fargate task from the API server, sharing the same Docker image but with a different CMD (node dist/worker.js).

Queue Definitions

Queue                   Trigger                                      Job
transcription-queue     Session endedAt set                          Fetch audio from S3 → call Transcribe → store transcript
summary-queue           Transcript available OR note saved manually  Fetch notes + transcript → call AI API → store draft → notify caregiver via WS
followup-queue          Homework dueAt passed                        Check completedAt → send reminder if overdue → reschedule check based on followUpRule
report-queue            Report requested                             Render HTML → Puppeteer PDF → upload to S3 → send email
alert-escalation-queue  Alert created or escalation step scheduled   Check acknowledgedAt → if no ack within SLA, fire next escalation step (in-app → email → voice call)
message-queue           Outbound message created                     Rate-limit check → call channel API (Twilio/SES/Telegram) → update Message.sentAt
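The alert-escalation-queue check can be sketched as a pure function. Several things here are assumptions, not part of the plan: escalationStep is read as the index of the last channel already notified (0 = in-app), each step is given the same SLA window measured from createdAt, and the names `Channel`, `AlertState`, and `nextEscalation` are hypothetical.

```typescript
type Channel = "in-app" | "email" | "voice";
const ESCALATION_CHANNELS: Channel[] = ["in-app", "email", "voice"];

type AlertState = {
  escalationStep: number; // index of the last channel already notified
  createdAt: Date;
  acknowledgedAt: Date | null;
};

// Returns the channel to fire next, or null when the alert is acknowledged,
// the SLA window has not elapsed yet, or the chain is exhausted.
function nextEscalation(alert: AlertState, now: Date, slaMs: number): Channel | null {
  if (alert.acknowledgedAt) return null;
  const dueAt = alert.createdAt.getTime() + (alert.escalationStep + 1) * slaMs;
  if (now.getTime() < dueAt) return null;
  return ESCALATION_CHANNELS[alert.escalationStep + 1] ?? null;
}

const createdAt = new Date("2024-01-01T00:00:00Z");
const minutesAfter = (m: number) => new Date(createdAt.getTime() + m * 60_000);
const SLA_MS = 5 * 60_000; // assumed 5-minute SLA per step

const tooEarly = nextEscalation(
  { escalationStep: 0, createdAt, acknowledgedAt: null }, minutesAfter(4), SLA_MS);
const firstEscalation = nextEscalation(
  { escalationStep: 0, createdAt, acknowledgedAt: null }, minutesAfter(5), SLA_MS);
const secondEscalation = nextEscalation(
  { escalationStep: 1, createdAt, acknowledgedAt: null }, minutesAfter(10), SLA_MS);
```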

Retry & Error Handling

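import type { DefaultJobOptions } from 'bullmq';

const defaultJobOptions: DefaultJobOptions = {
  attempts: 3, // one initial attempt plus two retries
  backoff: {
    type: 'exponential',
    delay: 2000, // retries wait 2s, then 4s
  },
  removeOnComplete: { count: 100 }, // keep only the 100 most recent completed jobs
  removeOnFail: false, // keep failed jobs for inspection
};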
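The retry schedule implied by these options can be checked with a little arithmetic. With exponential backoff the wait before retry n is delay × 2^(n − 1), and a job with `attempts: 3` is retried at most twice. `retryDelays` is a hypothetical helper, not a BullMQ function.

```typescript
// Delay before each retry under exponential backoff: base * 2^(n - 1),
// where n counts the retries starting at 1.
function retryDelays(attempts: number, baseDelayMs: number): number[] {
  const retries = Math.max(attempts - 1, 0); // the first attempt is not a retry
  return Array.from({ length: retries }, (_, i) => baseDelayMs * 2 ** i);
}

const delays = retryDelays(3, 2000);
const worstCaseWaitMs = delays.reduce((sum, d) => sum + d, 0);
```

For `attempts: 3` and `delay: 2000` this gives waits of 2 s and 4 s, so a job that keeps failing reaches the failed state after about 6 s of backoff plus its own run time.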

Failed jobs after max retries are moved to a dead-letter queue ({queue-name}:failed). A separate monitoring job scans dead-letter queues every 5 minutes and fires a Slack notification + CloudWatch alarm if the count exceeds a threshold.


9. Real-time

Server-side: socket.io v4 with the Redis adapter (@socket.io/redis-adapter) so that WebSocket events are broadcast correctly across multiple ECS API task instances.

Connection lifecycle:

  1. Client connects on login, sends AUTH event with the access token
  2. Server validates the JWT and joins the socket to rooms: clinic:{clinicId} and user:{userId}
  3. Caregiver starting a session joins session:{sessionId} for transcription events
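The room names used in the lifecycle above can be derived in one place so that client and server cannot drift apart. `SocketUser` and `roomsFor` are hypothetical names for this sketch.

```typescript
type SocketUser = { userId: string; clinicId: string };

// Rooms joined after the JWT is validated; the session room is only
// added while the caregiver has an active session.
function roomsFor(user: SocketUser, activeSessionId?: string): string[] {
  const rooms = [`clinic:${user.clinicId}`, `user:${user.userId}`];
  if (activeSessionId) rooms.push(`session:${activeSessionId}`);
  return rooms;
}

const baseRooms = roomsFor({ userId: "u1", clinicId: "c1" });
const inSessionRooms = roomsFor({ userId: "u1", clinicId: "c1" }, "s9");
```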

Events:

Event                Direction        Payload
transcription:chunk  Server → client  { sessionId, text, isFinal }
alert:new            Server → client  { alertId, type, priority, patientName }
alert:updated        Server → client  { alertId, status }
session:updated      Server → client  { sessionId, updatedBy }
dashboard:stats      Server → client  { activeAlerts, todaySessions }
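The event table lends itself to a shared type map, for example in packages/types. In this sketch the field names come from the table, while the field types are assumptions; `EventBus` is a tiny stand-in for the socket client and not a socket.io API.

```typescript
// Payload shapes from the event table; field types are assumed.
type ServerEvents = {
  "transcription:chunk": { sessionId: string; text: string; isFinal: boolean };
  "alert:new": { alertId: string; type: string; priority: string; patientName: string };
  "alert:updated": { alertId: string; status: string };
  "session:updated": { sessionId: string; updatedBy: string };
  "dashboard:stats": { activeAlerts: number; todaySessions: number };
};

type Listener = (payload: unknown) => void;

// Minimal typed dispatcher: handlers receive the payload type of their event.
class EventBus {
  private listeners = new Map<string, Listener[]>();

  on<E extends keyof ServerEvents>(
    event: E,
    handler: (payload: ServerEvents[E]) => void,
  ): void {
    const list = this.listeners.get(event) ?? [];
    list.push(handler as Listener);
    this.listeners.set(event, list);
  }

  emit<E extends keyof ServerEvents>(event: E, payload: ServerEvents[E]): void {
    for (const listener of this.listeners.get(event) ?? []) listener(payload);
  }
}

const bus = new EventBus();
const finalTexts: string[] = [];
bus.on("transcription:chunk", (chunk) => {
  if (chunk.isFinal) finalTexts.push(chunk.text); // keep only final segments
});
bus.emit("transcription:chunk", { sessionId: "s1", text: "partial", isFinal: false });
bus.emit("transcription:chunk", { sessionId: "s1", text: "Hello there.", isFinal: true });
```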

SSE fallback: If WebSocket fails (corporate proxies, etc.), the client falls back to EventSource on GET /api/sse/alerts and GET /api/sse/transcription/{sessionId}. These are one-way feeds and cover the most critical real-time use cases.


10. Infrastructure (AWS)

Compute

Data

Storage & Delivery

Auth & Routing

Async / Eventing

CI/CD (GitHub Actions)

pull_request:
  1. pnpm install
  2. biome lint + type-check (tsc --noEmit)
  3. vitest run (unit + integration with Docker Compose PostgreSQL)
  4. playwright (E2E, staging environment)

push to main:
  1-4 above
  5. docker buildx build → push to ECR (tagged with git SHA)
  6. terraform plan (PR comment) / terraform apply (main only)
  7. ecs update-service --force-new-deployment (blue/green via ECS circuit breaker)

Environments:


11. Security & Compliance

HIPAA (US Patients)

GDPR (EU Patients)

Application Security

Threat                      Mitigation
SQL injection               Prisma parameterised queries (no raw SQL except reviewed migrations)
XSS                         React's default escaping; strict CSP headers (script-src 'self')
CSRF                        SameSite=Strict cookies; Origin header validation on state-changing requests
Rate limiting               Per-IP: 100 req/min; Per-user: 1000 req/min; per-endpoint overrides for auth routes (10 req/min) via @fastify/rate-limit
Secrets leakage             All secrets in AWS Secrets Manager; injected as env vars at ECS task definition time; no .env files in the repo
Dependency vulnerabilities  Dependabot weekly PRs + pnpm audit in CI as a blocking check
Auth bypass                 JWT audience + issuer validated; short access token TTL (15 min); refresh token rotation
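To make the rate-limit numbers concrete, the sketch below implements a simple fixed-window counter with an injected clock. It illustrates the concept only and is not how @fastify/rate-limit is implemented or configured; `FixedWindowLimiter` is a hypothetical class.

```typescript
// Fixed-window counter keyed by IP or user id; time is passed in so the
// behaviour can be tested without waiting.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(key: string, nowMs: number): boolean {
    const current = this.windows.get(key);
    if (!current || nowMs - current.start >= this.windowMs) {
      this.windows.set(key, { start: nowMs, count: 1 }); // start a new window
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// Auth routes: 10 requests per minute per IP.
const authLimiter = new FixedWindowLimiter(10, 60_000);
const results = Array.from({ length: 11 }, () => authLimiter.allow("203.0.113.7", 0));
const allowedAfterWindow = authLimiter.allow("203.0.113.7", 60_000);
```

The first ten calls in the window pass, the eleventh is rejected, and the counter resets once a full minute has elapsed.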

A third-party penetration test is scheduled in the final week before the Phase A launch.


12. Observability

Logging

Tracing

Metrics (Custom CloudWatch)

Metric                               Alarm threshold
API/ErrorRate5xx                     > 1% over 5 min → PagerDuty P2
Worker/QueueDepth[transcription]     > 50 jobs → Slack warning
Worker/QueueDepth[alert-escalation]  > 10 jobs → PagerDuty P1
AI/SummaryLatencyP99                 > 10 s → Slack warning
DB/ConnectionCount                   > 80% of max → PagerDuty P2

Frontend Error Tracking


13. Testing Strategy

Unit Tests (Vitest)

Integration Tests (Supertest + Fastify test harness)

E2E Tests (Playwright)

Contract Tests

Visual Regression (Storybook + Chromatic)

Coverage Targets

Area                                                  Target
Domain logic (services, validators, prompt builders)  80% line coverage
API route handlers                                    70% (via integration tests)
UI components                                         60% (Storybook stories + unit tests)
E2E critical paths                                    100% of the 5 defined paths green

14. Phase A vs Phase B Scope

Phase A — Launch (Months 1–4)

All core platform screens and flows:

Phase A additional modules:

Phase B — 6–12 Month Roadmap


15. Estimated Team

Role                                       Count  Notes
Senior full-stack engineer (React + Node)  2      Own features end-to-end; one takes tech-lead role
DevOps / infrastructure engineer           1      AWS, Terraform, CI/CD, security hardening
Product designer                           1      Figma designs handed off to packages/ui as shadcn/ui components
QA engineer (part-time)                    1      E2E test authoring in Playwright, exploratory testing, release sign-off

Recommended build timeline: approximately 4 months to Phase A MVP with this team composition, assuming:

Milestones:

Week   Milestone
1–2    Monorepo scaffold, infra bootstrap (ECS, RDS, Cognito), auth flow end-to-end
3–5    Core data models (Prisma), patients CRUD, sessions CRUD, calendar sync (Google)
6–8    Note editor, live transcription (WebSocket), AI summary flow
9–11   Communications (WhatsApp, Telegram, email), alert engine, escalation queues
12–14  Reports (PDF), assessments, intake forms, consent, Phase A audit + security hardening
15–16  E2E testing, pen test remediation, performance tuning, production launch