Architecture, stack decisions, and engineering guidelines.
Sherpii is a multi-tenant, multi-caregiver clinic management platform designed to support mental health and allied health clinics of varying sizes. Each clinic operates as an isolated tenant with its own subdomain, user roles, patient data, and configuration. The platform consolidates session management, clinical note-taking, AI-assisted summarisation, patient communications, homework tracking, and escalation alerting into a single cohesive product.
The core tech stack is React 18 / TypeScript with shadcn/ui on the front end, a Node.js 20 / TypeScript API server on the back end, PostgreSQL as the primary relational database managed via Prisma ORM, and AWS Cognito for identity and authentication. All infrastructure runs on AWS and is provisioned as code using Terraform or AWS CDK. The monorepo is managed with pnpm workspaces.
The project is a pnpm workspace monorepo. All packages share a root-level pnpm-workspace.yaml and a root package.json for scripts and shared dev dependencies.
sherpii/
├── pnpm-workspace.yaml
├── package.json              # root scripts, shared devDeps
├── biome.json                # or .eslintrc.js at root
├── apps/
│   ├── web/                  # React/Vite frontend PWA
│   │   ├── src/
│   │   ├── public/
│   │   ├── vite.config.ts
│   │   └── package.json
│   └── api/                  # Fastify API server (Node.js, TypeScript)
│       ├── src/
│       │   ├── routes/
│       │   ├── services/
│       │   ├── jobs/
│       │   ├── plugins/
│       │   └── server.ts
│       └── package.json
├── packages/
│   ├── ui/                   # Shared shadcn/ui component library
│   │   ├── src/components/
│   │   ├── src/index.ts
│   │   └── package.json
│   ├── types/                # Shared TypeScript types + Zod schemas
│   │   ├── src/
│   │   └── package.json
│   ├── db/                   # Prisma client + migrations
│   │   ├── prisma/
│   │   │   ├── schema.prisma
│   │   │   └── migrations/
│   │   ├── src/
│   │   └── package.json
│   └── config/               # Shared configs: eslint, tsconfig, tailwind
│       ├── eslint/
│       ├── tsconfig/
│       │   ├── base.json
│       │   ├── react.json
│       │   └── node.json
│       └── tailwind/
└── infra/                    # Terraform modules (or AWS CDK)
    ├── modules/
    │   ├── ecs/
    │   ├── rds/
    │   ├── cognito/
    │   ├── redis/
    │   └── s3/
    ├── environments/
    │   ├── dev/
    │   ├── staging/
    │   └── production/
    └── main.tf
Key workspace conventions:
- packages/types and packages/db are consumed by both apps/web and apps/api
- packages/ui is consumed only by apps/web (and potentially a future patient-facing PWA)
- "sherpii-*" naming (e.g. sherpii-ui, sherpii-types)

| Concern | Library | Version |
|---|---|---|
| Framework | React | 18.x |
| Language | TypeScript | 5.x |
| Build tool | Vite | 5.x |
| Component system | shadcn/ui (Radix UI + Tailwind) | latest |
| Styling | Tailwind CSS | v3.x |
| Routing | React Router | v6.x |
| Server state | TanStack Query (React Query) | v5.x |
| Client state | Zustand | v4.x |
| Forms | React Hook Form + Zod | latest |
| Date handling | date-fns | v3.x |
| Animations | Framer Motion | v11.x |
| Icons | Lucide React | latest |
| PWA | vite-plugin-pwa (Workbox) | latest |
Route-based code splitting. All top-level routes are lazy-loaded via React.lazy + Suspense. Each feature area (dashboard, calendar, sessions, patients, comms, reports) is a separate chunk. This keeps the initial bundle under 200 KB gzipped.
// apps/web/src/router.tsx
import React from 'react';

// Each feature area is loaded on demand and emitted as its own chunk.
const Dashboard = React.lazy(() => import('./features/dashboard'));
const Calendar = React.lazy(() => import('./features/calendar'));
const Sessions = React.lazy(() => import('./features/sessions'));
TanStack Query as the data layer. All server data (patients, sessions, notes) lives in the React Query cache. Mutations use onMutate + onError rollback for optimistic UI updates — especially for note saving and calendar event drag-and-drop.
Zustand for client-only state. Used for: active session state (recording in progress, transcription buffer), sidebar collapse state, notification panel open/close, and ephemeral form drafts. Zustand stores are colocated with their feature folder and are not persisted to localStorage except for non-sensitive UI preferences.
Real-time. A singleton WebSocket connection is established on login (socket.io-client or native WebSocket). The connection subscribes to clinic:{clinicId} and user:{userId} channels. Events emitted from the server:
- transcription:chunk - partial transcript text for the live transcription panel
- alert:new - new alert triggers an in-app notification badge
- session:updated - another caregiver edited a shared session

A Server-Sent Events fallback (EventSource) is used for one-way feeds (alerts, transcription) if the WebSocket connection fails.
Feature flags. On app boot, a lightweight GET /api/config endpoint returns a features map keyed by clinic ID. This controls progressive rollout of beta features (e.g. AI summarisation, WhatsApp integration). The flag map is stored in Zustand and consumed by a useFeatureFlag(key) hook.
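The lookup behind useFeatureFlag(key) can be sketched as a plain function, independent of Zustand and React. This is a minimal sketch under assumptions: the `FeatureMap` shape, the example flag names, and the default-to-off behaviour are not specified in this document, which only states that GET /api/config returns a features map.

```typescript
// Hypothetical shape of the `features` map returned by GET /api/config.
type FeatureMap = Record<string, boolean>;

// Returns true only when the flag is explicitly enabled.
// Unknown or missing flags default to off, so a beta feature stays hidden
// if the config request fails or the flag has not been rolled out yet.
function isFeatureEnabled(features: FeatureMap | undefined, key: string): boolean {
  return features?.[key] === true;
}

// Flag names below are illustrative only.
const exampleFeatures: FeatureMap = {
  aiSummarisation: true,
  whatsappIntegration: false,
};
```

A hook would read the map from the store and delegate to this function, which keeps the rollout rule testable without rendering a component.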
PWA. The vite-plugin-pwa plugin generates a service worker using Workbox. Cache strategy:
- CacheFirst
- NetworkFirst with a 5-second timeout fallback
- CacheFirst with 30-day expiry

| Concern | Library |
|---|---|
| Runtime | Node.js 20 LTS |
| Language | TypeScript 5.x |
| HTTP framework | Fastify v4 |
| ORM | Prisma v5 |
| Database | PostgreSQL 15 |
| Validation | Zod |
| Job queues | BullMQ + ioredis |
| Auth middleware | @fastify/jwt + Cognito JWKS |
| API docs | fastify-swagger + @fastify/swagger-ui |
| AI | Anthropic SDK / OpenAI SDK |
| HTTP client | undici (bundled in Node 18+ as the engine behind the global fetch) |
| Logging | pino (Fastify default) |
RESTful JSON with OpenAPI 3.1. All routes are documented via Fastify's schema system. The OpenAPI spec is auto-generated at GET /api/docs/openapi.json and served via Swagger UI in non-production environments.
Consistent error format:
{
  "code": "PATIENT_NOT_FOUND",
  "message": "No patient found with the given ID",
  "details": { "patientId": "abc-123" }
}
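One way to keep every route on this format is to throw a typed error and convert it in a single place. The sketch below is an assumption-laden illustration: the `ApiError` class, the `statusCode` field, and the `INTERNAL_ERROR` fallback code are hypothetical names that do not appear elsewhere in this document.

```typescript
// Shape of the error body shown above.
interface ErrorBody {
  code: string;
  message: string;
  details?: Record<string, unknown>;
}

// Hypothetical error class carrying the HTTP status alongside the body fields.
class ApiError extends Error {
  readonly statusCode: number;
  readonly code: string;
  readonly details: Record<string, unknown> | undefined;

  constructor(statusCode: number, code: string, message: string, details?: Record<string, unknown>) {
    super(message);
    this.name = "ApiError";
    this.statusCode = statusCode;
    this.code = code;
    this.details = details;
  }
}

// Converts any thrown value into a status code plus the consistent body.
// Unknown errors are masked so internal messages never reach the client.
function toErrorResponse(err: unknown): { statusCode: number; body: ErrorBody } {
  if (err instanceof ApiError) {
    const body: ErrorBody = { code: err.code, message: err.message };
    if (err.details !== undefined) body.details = err.details;
    return { statusCode: err.statusCode, body };
  }
  return {
    statusCode: 500,
    body: { code: "INTERNAL_ERROR", message: "Unexpected server error" },
  };
}
```

In a Fastify server this conversion would typically live in a global error handler, so individual route handlers only throw.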
Cursor-based pagination for all list endpoints:
{
  "data": [...],
  "pagination": {
    "nextCursor": "eyJpZCI6IjEyMyJ9",
    "hasMore": true,
    "limit": 20
  }
}
Multi-tenancy enforcement. Every Fastify route handler receives a clinicId extracted from the verified JWT. All Prisma queries include where: { clinicId } as a hard constraint. A custom Fastify plugin (clinicScope.plugin.ts) injects this at the request lifecycle level — it is not left to individual route handlers to remember.
// Example scoped query
async function getPatients(clinicId: string, cursor?: string) {
  return prisma.patient.findMany({
    where: { clinicId, status: 'ACTIVE' }, // tenant scope is always part of the filter
    cursor: cursor ? { id: cursor } : undefined,
    take: 21, // 20-item page plus one look-ahead row (assumed) used to compute hasMore
    orderBy: { createdAt: 'desc' },
  });
}
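The pagination envelope can be derived from the rows that getPatients returns. The sketch below assumes the cursor is base64-encoded JSON of the form {"id": ...}, which is what the example value eyJpZCI6IjEyMyJ9 decodes to, and that the API decodes it before handing the id to Prisma. The helper names `encodeCursor`, `decodeCursor`, and `buildPage` are hypothetical.

```typescript
interface Page<T> {
  data: T[];
  pagination: { nextCursor: string | null; hasMore: boolean; limit: number };
}

// Opaque cursor: base64 of a small JSON object holding the row id.
function encodeCursor(id: string): string {
  return Buffer.from(JSON.stringify({ id }), "utf8").toString("base64");
}

// Reverses encodeCursor and rejects anything that is not { id: string }.
function decodeCursor(cursor: string): string {
  const parsed = JSON.parse(Buffer.from(cursor, "base64").toString("utf8"));
  if (typeof parsed?.id !== "string") throw new Error("Invalid cursor");
  return parsed.id;
}

// `rows` is expected to contain up to limit + 1 items (e.g. take: 21 for limit 20).
// The extra row is not returned; its id becomes the next cursor, because
// Prisma's cursor is inclusive and the next page should start at that row.
function buildPage<T extends { id: string }>(rows: T[], limit: number): Page<T> {
  const hasMore = rows.length > limit;
  const data = hasMore ? rows.slice(0, limit) : rows;
  const nextCursor = hasMore ? encodeCursor(rows[limit].id) : null;
  return { data, pagination: { nextCursor, hasMore, limit } };
}
```

Keeping the cursor opaque lets the server change the underlying sort key later without breaking clients that only echo the value back.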
- {clinic}.sherpii.io
- clinicId from the subdomain via GET /api/clinics/resolve?subdomain={subdomain}
- code_challenge, code_verifier)
- /oauth2/token) and receives:
  - access_token (JWT, stored in memory only)
  - id_token (JWT, stored in memory only)
  - refresh_token (stored in an httpOnly, Secure, SameSite=Strict cookie)
- Authorization: Bearer {access_token}
- /.well-known/jwks.json) using @fastify/jwt. The JWKS is cached in memory with a 12-hour TTL.

No session state on the server. The API is fully stateless. The refresh_token cookie is handled by a dedicated POST /api/auth/refresh endpoint which exchanges it with Cognito and returns a new access token.
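The code_verifier and code_challenge pair mentioned in the login flow follows RFC 7636 (PKCE, S256 method). The sketch below uses Node's crypto module to show the derivation; in the browser the same computation would normally go through the Web Crypto API. The function names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

// 32 random bytes encode to a 43-character base64url string,
// which is the minimum verifier length allowed by RFC 7636.
function createCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// S256 challenge: BASE64URL(SHA256(ASCII(code_verifier))), without padding.
function createCodeChallenge(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}
```

The client sends the challenge with the authorization request and keeps the verifier in memory until it exchanges the authorization code at the token endpoint.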
| Role | Description |
|---|---|
| CAREGIVER | Sees own patients and sessions only |
| SUPERVISOR | Sees all patients within the clinic; can review caregiver notes |
| ADMIN | Full clinic access; manages users, settings, integrations |
Roles are stored as Cognito group memberships and mirrored in the User.role column in PostgreSQL for fast query-time access checks. A requireRole(roles: Role[]) Fastify hook enforces role access per route group.
Beyond role checks, all queries are scoped:
- WHERE clinicId = $1 AND caregiverId = $2
- WHERE clinicId = $1

This is enforced in a service layer, not just in route handlers, to prevent accidental data leaks via internal service calls.
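A service-layer helper can turn the authenticated user into the filter that every patient query must include. This sketch assumes, based on the role descriptions above, that CAREGIVER maps to the clinic-plus-caregiver filter and that SUPERVISOR and ADMIN map to the clinic-only filter. `AuthContext` and `patientScope` are hypothetical names.

```typescript
type Role = "CAREGIVER" | "SUPERVISOR" | "ADMIN";

// Values assumed to come from the verified JWT and the mirrored User row.
interface AuthContext {
  userId: string;
  clinicId: string;
  role: Role;
}

interface PatientScope {
  clinicId: string;
  caregiverId?: string;
}

// Builds the mandatory part of a `where` clause for patient queries.
// The clinic filter is always present; caregivers are further limited
// to the patients assigned to them.
function patientScope(ctx: AuthContext): PatientScope {
  if (ctx.role === "CAREGIVER") {
    return { clinicId: ctx.clinicId, caregiverId: ctx.userId };
  }
  return { clinicId: ctx.clinicId };
}
```

A service would spread the result into its query, for example `where: { ...patientScope(ctx), status: 'ACTIVE' }`, so no call path can omit the tenant filter.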
model Clinic {
id String @id @default(cuid())
name String
subdomain String @unique
plan String @default("starter")
settings Json @default("{}")
createdAt DateTime @default(now())
users User[]
patients Patient[]
}
model User {
id String @id @default(cuid())
clinicId String
cognitoSub String @unique
role Role
name String
email String
preferences Json @default("{}")
createdAt DateTime @default(now())
clinic Clinic @relation(fields: [clinicId], references: [id])
patients Patient[]
sessions Session[]
alerts Alert[]
}
model Patient {
id String @id @default(cuid())
clinicId String
caregiverId String
name String
dob DateTime?
diagnosis String?
status PatientStatus @default(ACTIVE)
contacts Json @default("[]")
consentStatus ConsentStatus @default(PENDING)
createdAt DateTime @default(now())
clinic Clinic @relation(fields: [clinicId], references: [id])
caregiver User @relation(fields: [caregiverId], references: [id])
sessions Session[]
notes Note[]
signals Signal[]
homework Homework[]
messages Message[]
alerts Alert[]
reports Report[]
}
model Session {
id String @id @default(cuid())
patientId String
clinicId String
caregiverId String
scheduledAt DateTime
startedAt DateTime?
endedAt DateTime?
type SessionType
status SessionStatus @default(SCHEDULED)
recordingUrl String?
transcriptUrl String?
createdAt DateTime @default(now())
patient Patient @relation(fields: [patientId], references: [id])
caregiver User @relation(fields: [caregiverId], references: [id])
notes Note[]
}
model Note {
id String @id @default(cuid())
sessionId String
patientId String
clinicId String
templateId String?
body Json @default("{}")
summary String?
createdAt DateTime @default(now())
updatedAt DateTime @updatedAt
session Session @relation(fields: [sessionId], references: [id])
patient Patient @relation(fields: [patientId], references: [id])
}
model Signal {
id String @id @default(cuid())
patientId String
clinicId String
type SignalType
value Float
measuredAt DateTime
metadata Json @default("{}")
patient Patient @relation(fields: [patientId], references: [id])
}
model Homework {
id String @id @default(cuid())
patientId String
clinicId String
description String
assignedAt DateTime @default(now())
dueAt DateTime?
completedAt DateTime?
followUpRule Json @default("{}")
status HomeworkStatus @default(PENDING)
patient Patient @relation(fields: [patientId], references: [id])
}
model Message {
id String @id @default(cuid())
patientId String
clinicId String
direction MessageDirection
channel MessageChannel
content String
metadata Json @default("{}")
sentAt DateTime @default(now())
readAt DateTime?
patient Patient @relation(fields: [patientId], references: [id])
}
model Alert {
id String @id @default(cuid())
clinicId String
caregiverId String
type AlertType
priority Priority
patientId String?
status AlertStatus @default(OPEN)
escalationStep Int @default(0)
createdAt DateTime @default(now())
acknowledgedAt DateTime?
caregiver User @relation(fields: [caregiverId], references: [id])
patient Patient? @relation(fields: [patientId], references: [id])
}
model Report {
id String @id @default(cuid())
patientId String
clinicId String
type ReportType
generatedAt DateTime @default(now())
pdfUrl String?
sentTo Json @default("[]")
status ReportStatus @default(PENDING)
patient Patient @relation(fields: [patientId], references: [id])
}
model AuditLog {
id String @id @default(cuid())
clinicId String
userId String
action String
resourceType String
resourceId String
metadata Json @default("{}")
timestamp DateTime @default(now())
@@index([clinicId, timestamp])
@@index([resourceType, resourceId])
}
enum Role { CAREGIVER SUPERVISOR ADMIN }
enum PatientStatus { ACTIVE INACTIVE DISCHARGED }
enum ConsentStatus { PENDING GIVEN WITHDRAWN }
enum SessionType { INDIVIDUAL GROUP INTAKE CRISIS }
enum SessionStatus { SCHEDULED IN_PROGRESS COMPLETED CANCELLED NO_SHOW }
enum SignalType { PHQ9 GAD7 MOOD CUSTOM }
enum HomeworkStatus { PENDING IN_PROGRESS COMPLETED OVERDUE CANCELLED }
enum MessageDirection { IN OUT }
enum MessageChannel { EMAIL WHATSAPP TELEGRAM SMS VOICE }
enum AlertType { CRISIS OVERDUE_HOMEWORK MISSED_SESSION SIGNAL_THRESHOLD }
enum Priority { LOW MEDIUM HIGH CRITICAL }
enum AlertStatus { OPEN ACKNOWLEDGED RESOLVED ESCALATED }
enum ReportType { SESSION_SUMMARY PROGRESS DISCHARGE INTAKE }
enum ReportStatus { PENDING GENERATING COMPLETE FAILED }
Index strategy:
- Patient: index on (clinicId, caregiverId, status)
- Session: index on (clinicId, caregiverId, scheduledAt)
- Message: index on (clinicId, patientId, sentAt)
- Alert: index on (clinicId, caregiverId, status, priority)
- AuditLog: compound index on (clinicId, timestamp) for compliance reporting

Google Calendar
- googleapis Node.js SDK
- calendar.events, calendar.readonly
- User.preferences (or a dedicated CalendarToken table)
- watch channel on the calendar. Google sends change notifications to POST /api/webhooks/google-calendar. On notification, fetch the changed event and upsert into the Session table.
- fuse.js). Unmatched events are surfaced to the caregiver for one-time confirmation.

Microsoft Outlook / Exchange
- @microsoft/microsoft-graph-client
- @azure/msal-node (MSAL)
- Calendars.Read, Calendars.ReadWrite
- /subscriptions) with a webhook endpoint

- mail.sherpii.io)
- @react-email/components for type-safe HTML email templates compiled to static HTML

- POST /api/webhooks/twilio-whatsapp → verify Twilio signature → parse Body and From → look up Patient by phone number → create Message record → if within 24-hour service window, optionally trigger AI auto-reply draft for caregiver review
- message-queue in BullMQ. Rate limiting: max 10 messages per patient per hour. Template messages (pre-approved by Meta) used for all scheduled outbound touchpoints.

- grammy (TypeScript-first Telegram Bot API framework)
- start token: https://t.me/SherpiiBot?start={clinicToken}_{patientToken}. On /start, the bot resolves the patient and stores the chatId against the Patient record.
- POST /api/webhooks/telegram) → route to patient → create Message record
- InlineKeyboard for structured one-tap responses. Button callbacks update the relevant Homework or Signal record.

- <Say> and <Gather> verbs deliver pre-recorded or TTS reminders. Patient key-press responses (e.g. "press 1 to confirm") are captured and stored.
- Alert reaches escalationStep >= 2, a voice call is placed to the caregiver's registered mobile number using a TwiML Bin with the patient name and alert type.
- Patient.consentStatus. Recordings stored in S3 with server-side encryption.

Live streaming (in-session)
- MediaRecorder API captures audio chunks (16 kHz, mono, 250 ms intervals)
- ArrayBuffer
- @aws-sdk/client-transcribe-streaming SDK

Post-session (async)
- recordings/{clinicId}/{sessionId}.webm)
- ObjectCreated event triggers an SQS message
- transcription-queue BullMQ worker picks up the job, calls Amazon Transcribe (standard) or Whisper API
- Session.transcriptUrl (S3 JSON) and a plain-text version cached in the DB

Language support: Hebrew, English, and Arabic via Amazon Transcribe Standard. For clinical terminology accuracy in Hebrew, Amazon Transcribe Medical is evaluated per clinic. Whisper large-v3 is the fallback for languages or accents with lower Amazon accuracy.
[System prompt — clinical tone, privacy constraints, output format]
[Clinic context — modality, treatment model, caregiver name] ← cached
[Patient context — diagnosis, recent history, last session summary]
[Current session — structured notes JSON + transcript]
[Instruction — produce: Presenting Concern / Key Themes / Plan / Homework]
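The layered prompt above can be assembled so that the stable parts form a cacheable prefix. The sketch below builds a request body in the shape of the Anthropic Messages API, where a `cache_control` marker on a system text block caches the prefix up to and including that block. The `SummaryInput` type and `buildSummaryRequest` function are hypothetical, and model selection and token limits are deliberately left out.

```typescript
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical input; field names mirror the five prompt layers.
interface SummaryInput {
  systemPrompt: string;
  clinicContext: string;
  patientContext: string;
  sessionNotes: unknown; // structured notes JSON
  transcript: string;
}

function buildSummaryRequest(input: SummaryInput) {
  // Stable layers go first. The marker on the clinic context block ends
  // the cacheable prefix, so system prompt + clinic context are reused.
  const system: TextBlock[] = [
    { type: "text", text: input.systemPrompt },
    { type: "text", text: input.clinicContext, cache_control: { type: "ephemeral" } },
  ];

  // Per-patient and per-session layers change on every call and stay uncached.
  const userContent = [
    `Patient context:\n${input.patientContext}`,
    `Session notes (JSON):\n${JSON.stringify(input.sessionNotes)}`,
    `Transcript:\n${input.transcript}`,
    "Produce: Presenting Concern / Key Themes / Plan / Homework",
  ].join("\n\n");

  return { system, messages: [{ role: "user" as const, content: userContent }] };
}
```

Ordering matters here: anything that varies per patient or per session must come after the marked block, otherwise the prefix changes and the cache is missed.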
- cache_control parameter is applied to the system prompt + clinic context block. This reduces latency on the second+ summary for a given caregiver from ~4 s to ~1.5 s and cuts token costs by ~75% for the cached prefix.
- summary_draft field on the Note record. The caregiver sees a diff-style review UI. Note.summary is only set after explicit caregiver approval. AI output is never auto-persisted.
- AuditLog with action: AI_SUMMARY_GENERATED.

All queues use BullMQ backed by ElastiCache Redis. The worker process is a separate ECS Fargate task from the API server, sharing the same Docker image but with a different CMD (node dist/worker.js).
| Queue | Trigger | Job |
|---|---|---|
| transcription-queue | Session endedAt set | Fetch audio from S3 → call Transcribe → store transcript |
| summary-queue | Transcript available OR note saved manually | Fetch notes + transcript → call AI API → store draft → notify caregiver via WS |
| followup-queue | Homework dueAt passed | Check completedAt → send reminder if overdue → reschedule check based on followUpRule |
| report-queue | Report requested | Render HTML → Puppeteer PDF → upload to S3 → send email |
| alert-escalation-queue | Alert created or escalation step scheduled | Check acknowledgedAt → if no ack within SLA, fire next escalation step (in-app → email → voice call) |
| message-queue | Outbound message created | Rate-limit check → call channel API (Twilio/SES/Telegram) → update Message.sentAt |
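The rate-limit check in the message-queue job (max 10 messages per patient per hour) can be sketched as a sliding-window counter. This version keeps its state in process memory purely for illustration. With several worker tasks the counters would need to live in shared storage such as Redis. The class name and method are hypothetical.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the send when the key is under the limit.
  // Returns false without recording when the window is already full.
  tryAcquire(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have fallen out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

// 10 messages per patient per rolling hour, keyed by patient id.
const messageLimiter = new SlidingWindowLimiter(10, 60 * 60 * 1000);
```

A job that is refused would typically be delayed and retried rather than dropped, so scheduled touchpoints are postponed instead of lost.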
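import type { DefaultJobOptions } from 'bullmq';

const defaultJobOptions: DefaultJobOptions = {
  attempts: 3, // total attempts, including the first run
  backoff: {
    type: 'exponential',
    delay: 2000, // base delay: retries after 2s, then 4s
  },
  removeOnComplete: { count: 100 }, // keep only the 100 most recent completed jobs
  removeOnFail: false, // keep failed jobs for inspection
};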
Failed jobs after max retries are moved to a dead-letter queue ({queue-name}:failed). A separate monitoring job scans dead-letter queues every 5 minutes and fires a Slack notification + CloudWatch alarm if the count exceeds a threshold.
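To make the retry schedule concrete, the helper below computes the wait before each retry. It assumes BullMQ's built-in exponential strategy, which as far as I understand waits base delay × 2^(attempts made − 1), and that `attempts` counts the first run. Under those assumptions, three attempts with a 2000 ms base give two retries. The function name is hypothetical.

```typescript
// Returns the delay before each retry, in milliseconds.
// `attempts` is the total number of tries, so there are attempts - 1 retries.
function retryDelaysMs(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < attempts; retry++) {
    delays.push(baseDelayMs * 2 ** (retry - 1));
  }
  return delays;
}

// With the defaults above: two retries before the job is marked failed.
const defaultSchedule = retryDelaysMs(3, 2000);
```

Summing the schedule gives a lower bound on how long a persistently failing job stays in flight before it lands in the failed set, which is useful when choosing alert SLAs.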
Server-side: socket.io v4 with the Redis adapter (@socket.io/redis-adapter) so that WebSocket events are broadcast correctly across multiple ECS API task instances.
Connection lifecycle:
- AUTH event with the access token
- clinic:{clinicId} and user:{userId}
- session:{sessionId} for transcription events

Events:
| Event | Direction | Payload |
|---|---|---|
| transcription:chunk | Server → client | { sessionId, text, isFinal } |
| alert:new | Server → client | { alertId, type, priority, patientName } |
| alert:updated | Server → client | { alertId, status } |
| session:updated | Server → client | { sessionId, updatedBy } |
| dashboard:stats | Server → client | { activeAlerts, todaySessions } |
SSE fallback: If WebSocket fails (corporate proxies, etc.), the client falls back to EventSource on GET /api/sse/alerts and GET /api/sse/transcription/{sessionId}. These are one-way feeds and cover the most critical real-time use cases.
- sherpii-api: Fastify server, auto-scaled on CPU + request count (min 2, max 10 tasks)
- sherpii-worker: BullMQ worker, auto-scaled on Redis queue depth (custom CloudWatch metric)
- db.t4g.medium for staging, db.r7g.large for production. Automated daily snapshots retained for 30 days.
- cache.t4g.medium for staging, cluster mode with 2 shards for production. Used by BullMQ and socket.io adapter.
- sherpii-recordings-{env}: audio files (encrypted, SSE-KMS)
- sherpii-reports-{env}: generated PDF reports (SSE-S3)
- sherpii-documents-{env}: patient intake forms, consent docs
- auth.sherpii.io. App clients per environment. Hosted UI with clinic-branded CSS override.
- *.sherpii.io → CloudFront for clinic subdomains. The frontend resolves clinic context from the subdomain on boot.

pull_request:
1. pnpm install
2. biome lint + type-check (tsc --noEmit)
3. vitest run (unit + integration with Docker Compose PostgreSQL)
4. playwright (E2E, staging environment)
push to main:
1-4 above
5. docker buildx build → push to ECR (tagged with git SHA)
6. terraform plan (PR comment) / terraform apply (main only)
7. ecs update-service --force-new-deployment (blue/green via ECS circuit breaker)
Environments:
- dev: shared, always-on, seeded with demo data
- staging: per-PR ephemeral environment spun up via Terraform workspace (destroyed 24 h after PR merge)
- production: promoted from main on manual approval gate

- ELBSecurityPolicy-TLS13-1-2-2021-06)
- AuditLog record + a structured CloudTrail event
- eu-west-1 (Ireland) deployment option is available; Terraform workspace parameter AWS_REGION controls this
- DELETE /api/patients/{id} performs a soft-delete (sets status = DELETED, obfuscates PII fields). A nightly hard-delete pipeline permanently removes records older than the retention period (configurable per clinic)
- Patient.consentStatus tracks consent per data category (recording, AI analysis, third-party sharing) with timestamps

| Threat | Mitigation |
|---|---|
| SQL injection | Prisma parameterised queries (no raw SQL except reviewed migrations) |
| XSS | React's default escaping; strict CSP headers (script-src 'self') |
| CSRF | SameSite=Strict cookies; Origin header validation on state-changing requests |
| Abuse and brute-force attempts | Rate limiting via @fastify/rate-limit: per-IP 100 req/min; per-user 1000 req/min; auth routes overridden to 10 req/min |
| Secrets leakage | All secrets in AWS Secrets Manager; injected as env vars at ECS task definition time; no .env files in the repo |
| Dependency vulnerabilities | Dependabot weekly PRs + pnpm audit in CI as a blocking check |
| Auth bypass | JWT audience + issuer validated; short access token TTL (15 min); refresh token rotation |
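The Origin header validation listed under CSRF can be sketched as a pure function. It assumes that legitimate requests come only from HTTPS pages on sherpii.io or one of its clinic subdomains. The function name and the default base domain parameter are hypothetical.

```typescript
// Accepts only https origins on the base domain or any of its subdomains.
// A missing or malformed Origin header is rejected.
function isAllowedOrigin(origin: string | undefined, baseDomain: string = "sherpii.io"): boolean {
  if (!origin) return false;

  let url: URL;
  try {
    url = new URL(origin);
  } catch {
    return false; // not a parseable origin
  }

  if (url.protocol !== "https:") return false;

  // Compare on the parsed hostname, never on the raw string, so that
  // look-alikes such as sherpii.io.evil.com or evilsherpii.io do not pass.
  return url.hostname === baseDomain || url.hostname.endsWith(`.${baseDomain}`);
}
```

The check would run in a pre-handler hook for POST, PUT, PATCH, and DELETE requests, complementing the SameSite=Strict cookie rather than replacing it.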
A third-party penetration test is scheduled in the final week before the Phase A launch.
- /sherpii/api/{env}, /sherpii/worker/{env})
- requestId, clinicId, userId, route, statusCode, responseTimeMs, error (if applicable)
- aws-xray-sdk-node wraps the Fastify HTTP server, Prisma client, and external HTTP calls

| Metric | Alarm Threshold |
|---|---|
| API/ErrorRate5xx | > 1% over 5 min → PagerDuty P2 |
| Worker/QueueDepth[transcription] | > 50 jobs → Slack warning |
| Worker/QueueDepth[alert-escalation] | > 10 jobs → PagerDuty P1 |
| AI/SummaryLatencyP99 | > 10 s → Slack warning |
| DB/ConnectionCount | > 80% of max → PagerDuty P2 |
- #eng-alerts Slack channel
- *.test.ts / *.test.tsx
- @testing-library/react + Vitest + msw (Mock Service Worker) for API mocking
- $transaction rollback pattern to reset state between tests (fast)
- fastify-swagger strict mode)
- openapi-typescript); type errors in CI catch schema drift
- packages/ui has a Storybook with stories for every component

| Area | Target |
|---|---|
| Domain logic (services, validators, prompt builders) | 80% line coverage |
| API route handlers | 70% (via integration tests) |
| UI components | 60% (Storybook stories + unit tests) |
| E2E critical paths | 100% of the 5 defined paths green |
All core platform screens and flows:
Phase A additional modules:
| Role | Count | Notes |
|---|---|---|
| Senior full-stack engineer (React + Node) | 2 | Own features end-to-end; one takes tech-lead role |
| DevOps / infrastructure engineer | 1 | AWS, Terraform, CI/CD, security hardening |
| Product designer | 1 | Figma designs handed off to packages/ui as shadcn/ui components |
| QA engineer (part-time) | 1 | E2E test authoring in Playwright, exploratory testing, release sign-off |
Recommended build timeline: approximately 4 months to Phase A MVP with this team composition, assuming:
Milestones:
| Week | Milestone |
|---|---|
| 1–2 | Monorepo scaffold, infra bootstrap (ECS, RDS, Cognito), auth flow end-to-end |
| 3–5 | Core data models (Prisma), patients CRUD, sessions CRUD, calendar sync (Google) |
| 6–8 | Note editor, live transcription (WebSocket), AI summary flow |
| 9–11 | Communications (WhatsApp, Telegram, email), alert engine, escalation queues |
| 12–14 | Reports (PDF), assessments, intake forms, consent, Phase A audit + security hardening |
| 15–16 | E2E testing, pen test remediation, performance tuning, production launch |