technical documentation specification

Hospital Information System (HIS) - Technical Architecture

A high-performance, distributed healthcare infrastructure composed of 9+ decoupled microservices. The system enforces strict data privacy through PII-sanitized event choreography and keeps authoritative data consistent via synchronous gRPC/Protobuf calls. It implements distributed patterns including Transactional Outbox, Sagas, and Two-Tier Hybrid Caching for high-throughput clinical workflows.

Spring Boot 3.4 · Spring Cloud Gateway · PostgreSQL 15 · Apache Kafka (KRaft) · Resilience4j · gRPC / Protobuf · Redis L1 Cache
services: 9 Microservices + 1 API Gateway
edge_protocol: REST / HTTP/1.1 + JWT
internal_protocol: gRPC / HTTP/2 + Protobuf
event_bus: Kafka PII-Clean Stream
01

System Architecture

High-Level Technical Topology

graph TB
  subgraph "Edge & Identity (Security Concern)"
    LB["Internet / LB"]
    GW["API Gateway<br/>(Port: 4004)"]
    Auth["Auth Service<br/>(JWT Security)"]
  end
  subgraph "Synchronous Clinical Core (gRPC Concern)"
    PS["Patient Service<br/>(Port: 8080)"]
    DS["Doctor Service<br/>(Port: 8083)"]
    AS["Appointment Service<br/>(Port: 8084)"]
  end
  subgraph "Transactional Outbox Publishers"
    AD["Admission Service<br/>(Port: 8086)"]
    SS["Support Service<br/>(Port: 8085)"]
  end
  subgraph "Infrastructure Layer (Persistence)"
    Kafka[("Apache Kafka<br/>(Event Stream)")]
    Redis[("Redis Shared Cache<br/>(L1/L2 Store)")]
  end
  subgraph "Event-Driven Consumers"
    BI["Billing Service<br/>(Invoicing)"]
    NO["Notification Service<br/>(Dispatch)"]
  end

  %% Ingress Flow
  LB --> GW
  GW --> Auth
  GW -- "REST" --> AS
  GW -- "REST" --> AD
  GW -- "REST" --> PS
  GW -- "REST" --> DS

  %% Sync gRPC Connections
  AS -- "gRPC/9090" --> PS
  AS -- "gRPC/9005" --> DS
  AD -- "gRPC/9090" --> PS
  NO -- "gRPC Fallback" --> PS

  %% Reliability: Transactional Outbox Pattern
  AD -- "SQL Commit" --> AD_DB[(PostgreSQL)]
  SS -- "SQL Commit" --> SS_DB[(PostgreSQL)]
  AD_DB -- "Polling Relay" --> Kafka
  SS_DB -- "Polling Relay" --> Kafka

  %% Direct Events
  AS -- "Produce" --> Kafka

  %% Consumption
  Kafka --> BI
  Kafka --> NO

  %% Distributed Cache Utilization
  SS -.-> Redis
  AD -.-> Redis
  NO -.-> Redis

Service Domain | Port | Communication Path | Infrastructure / Pattern Stack
api-gateway | 4004 | Edge Ingress | Reactive WebFilter / JWT Stateless Auth
patient-service | 8080 | gRPC Server | CQRS (Dual PostgreSQL Datasources)
doctor-service | 8083 | gRPC Server | Provider Master Data / gRPC Registry
appointment-service | 8084 | gRPC Client / Kafka | Transactional Write / Dynamic Constraint Check
billing-service | 8081 | Kafka Consumer | Idempotent Sink / Ledger Persistence
support-service | 8085 | Redis / Outbox | Polling Outbox Pattern / Redis L1 Cache
admission-service | 8086 | Redis / gRPC / Outbox | Transactional Outbox / Multi-tier Redis Cache
notification-service | 8090 | Redis / Kafka / gRPC | Hybrid Enrichment / Redis L1 Hydration

Architecture: Postgres Provisioning

The infrastructure utilizes a logical partitioning model. While services share a PostgreSQL cluster for development parity, they maintain strict schema-level isolation. Cross-domain data access is permitted only through gRPC for synchronous reads and Kafka for asynchronous synchronization.

02

Service Data Models

Each service exposes a REST API and persists its own entity graph. Below are the primary domain models per service.

patient-service :8080

Clinical Master Data (PMD). Implements CQRS with isolated write/read schemas and provides gRPC clinical history hydration.

id UUID
pmdRecord PatientRecord (gRPC)
clinicalFlags EnumSet<Flag>
doctor-service :8083

Provider registry. Manages clinical credentials, specialties, and real-time availability via gRPC server stubs.

providerId UUID
licenseType String
availability ScheduleState
appointment-service :8084

Booking orchestration. Validates multi-domain constraints via synchronous gRPC before committing to the local schema.

appointmentId UUID
patientId UUID
serviceDate LocalDateTime
admission-service :8086

Inpatient lifecycle. Uses the Transactional Outbox pattern to bridge clinical state changes to downstream financial events.

admissionId UUID
bedId UUID
status AdmitStatus
notification-service :8090

Dispatcher. Uses Hybrid Enrichment (Redis L1 / gRPC L2) to resolve PII from sanitized Kafka event payloads.

cacheKey user::contact::{id}
hydrator PatientGrpcStub
support-service :8085

Support domain. Merges Lab and Inventory operations; publishes transactional state updates via reliable Outbox Relay.

opCode String (SKU/Lab)
status ServiceStatus
Architectural Paradigms

The system is organized into three distinct operational layers to ensure scalability and fault isolation:

  • Layer 1: Edge & Security (Ingress) — Reactive API Gateway enforcing stateless JWT authentication and path rewriting.
  • Layer 2: Synchronous Clinical Core (gRPC) — Domain services (Appointment/Admission) performing authoritative validations via generated Protobuf stubs.
  • Layer 3: Asynchronous Event Plane (Kafka) — Decoupled side effects (Billing/Notifications) and state synchronization through reliable Outbox patterns.
03

Request Flow — Authentication

Auth endpoints (/api/auth/**) are whitelisted at the gateway — no JWT filter runs. The gateway strips the /api prefix and forwards to auth-service on port 8089. The auth-service itself has its own Spring Security config that permits /auth/** without a session.

POST /api/auth/login
──▶
Gateway
path matches /api/auth/** → skip JWT filter
StripPrefix=1 + SetRequestHeader Content-Type
──▶
auth-service :8089
findByName() + BCrypt.matches()
──▶
PostgreSQL
Jwts.builder() HS256, exp: 86400s
──▶
{ token: "..." }
JWT Verification at Gateway

For all non-auth routes, the gateway's jwtAuthenticationFilter (WebFilter) extracts the Bearer token, parses claims using Keys.hmacShaKeyFor(APP_SECRET), extracts roles into SimpleGrantedAuthority objects, and writes the authentication into ReactiveSecurityContextHolder. Clock skew tolerance is set to 5 minutes. Downstream services receive the request without further auth checks.
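
The verification step above can be sketched with JDK classes only. This is a minimal illustration, not the gateway's actual filter: the class name Hs256Verifier, the regex-based exp extraction, and the helper methods are assumptions made for the example, and the sketch skips role extraction and the header's alg check. It shows the two checks that matter for acceptance: the HMAC-SHA256 signature over header.payload, and the expiry compared against the current time plus a skew allowance (300 seconds, matching the 5-minute tolerance).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, not the gateway's real filter: checks an HS256 JWT with JDK classes only.
final class Hs256Verifier {
    // Naive "exp" extraction; a real service would use a JSON or JWT library instead.
    private static final Pattern EXP = Pattern.compile("\"exp\"\\s*:\\s*(\\d{1,18})");

    private final byte[] secret;
    private final long skewSeconds;

    Hs256Verifier(String secret, long skewSeconds) {
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
        this.skewSeconds = skewSeconds;
    }

    // Base64url-encodes a JSON segment without padding, as JWTs require.
    static String b64(String json) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    // HMAC-SHA256 over "header.payload", returned as an unpadded base64url string.
    String sign(String signingInput) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] sig = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    // True only if the signature matches and exp plus the skew allowance has not passed.
    boolean isValid(String token, long nowEpochSeconds) {
        String[] parts = token.split("\\.");
        if (parts.length != 3) return false;
        byte[] expected = sign(parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = parts[2].getBytes(StandardCharsets.US_ASCII);
        // Constant-time comparison avoids leaking how many bytes matched.
        if (!MessageDigest.isEqual(expected, actual)) return false;
        String payload;
        try {
            payload = new String(Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return false;
        }
        Matcher m = EXP.matcher(payload);
        return m.find() && nowEpochSeconds <= Long.parseLong(m.group(1)) + skewSeconds;
    }
}
```

Because downstream services perform no further auth checks, a rejection here is the only point at which a forged or expired token is stopped.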

04

Core Logic — Hybrid Enrichment

The system uses a "Cache-First" enrichment strategy to handle PII-sanitized events from the Kafka stream. This ensures high performance without sacrificing data authority.

01
Kafka Consumption — sanitized payload

A consumer (e.g., Notification or Billing) receives an event containing surrogate UUIDs (e.g., patientId). No PII is carried on the wire.

02
L1 Lookup — Redis hydration

The service attempts to resolve the PII from the Redis L1 cache using the UUID as a key. P95 latency: < 5ms.

03
L2 Fallback — gRPC authoritative fetch

On cache miss, the service invokes a gRPC BlockingStub against the patient-service master (Port 9090). This is the authoritative truth.

04
Cache Write-Back

The retrieved PII is written back to Redis with a TTL of 3600s to satisfy future requests for the same entity.
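
The four steps above can be sketched as a cache-first resolver. This is a minimal sketch under stated assumptions: an in-memory map stands in for Redis, a plain function stands in for the gRPC blocking stub, and the class name ContactResolver and its injected clock are hypothetical. The key format user::contact::{id} and the 3600s TTL come from this document.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical resolver: the map stands in for Redis, the function for the gRPC stub.
final class ContactResolver {
    private static final class Entry {
        final String value;
        final long expiresAtMillis;
        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<UUID, String> authoritativeFetch;
    private final LongSupplier clockMillis;
    private final long ttlMillis;
    private int fallbackCalls = 0;

    ContactResolver(Function<UUID, String> authoritativeFetch, LongSupplier clockMillis, long ttlMillis) {
        this.authoritativeFetch = authoritativeFetch;
        this.clockMillis = clockMillis;
        this.ttlMillis = ttlMillis;
    }

    // Number of times the authoritative source was consulted (cache misses).
    synchronized int fallbackCalls() {
        return fallbackCalls;
    }

    synchronized String resolve(UUID patientId) {
        String key = "user::contact::" + patientId;
        long now = clockMillis.getAsLong();
        Entry hit = cache.get(key);
        if (hit != null && hit.expiresAtMillis > now) {
            return hit.value;                          // L1 hit: no remote call
        }
        fallbackCalls++;
        String fresh = authoritativeFetch.apply(patientId); // L2: authoritative fetch
        if (fresh != null) {
            cache.put(key, new Entry(fresh, now + ttlMillis)); // write-back with TTL
        }
        return fresh;
    }
}
```

With a TTL of 3_600_000 ms, a second resolve for the same UUID within the hour is served from the cache, and the authoritative source is consulted again only after expiry.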

05

Core Logic — Admission & Financials

State synchronization between Clinical Admission and Financial Billing is guaranteed via the Transactional Outbox pattern, ensuring at-least-once delivery semantics for discharge events.

POST /discharge
──▶
Admission-Service
Atomic Transaction Start
Update Status + Insert Outbox
──▶
PostgreSQL (Admission)
Outbox Relay (CDC/Polling)
──▶
Kafka: patient-discharged
Idempotent Consumer
──▶
Billing-Service
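
The flow above can be illustrated with an in-memory stand-in for the admission schema. This is a sketch, not the service's implementation: the class OutboxDemo, its fields, and the JSON payload shape are assumptions, and a synchronized method stands in for a real database transaction. It shows why delivery is at-least-once: a row is marked published only after the publisher returns, so a failed publish is retried on the next polling pass.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// In-memory stand-in for the admission schema: a status column plus an outbox table.
final class OutboxDemo {
    static final class OutboxRow {
        final String topic;
        final String payload;
        boolean published;
        OutboxRow(String topic, String payload) {
            this.topic = topic;
            this.payload = payload;
        }
    }

    private String status = "ADMITTED";
    private final List<OutboxRow> outbox = new ArrayList<>();

    synchronized String status() {
        return status;
    }

    // Both writes happen under one lock, mimicking a single database transaction.
    synchronized void discharge(String admissionId) {
        status = "DISCHARGED";
        outbox.add(new OutboxRow("patient-discharged",
                "{\"admissionId\":\"" + admissionId + "\"}"));
    }

    // One polling pass: publish pending rows in order, marking each only after success.
    synchronized int relayOnce(Consumer<OutboxRow> publisher) {
        int sent = 0;
        for (OutboxRow row : outbox) {
            if (row.published) continue;
            try {
                publisher.accept(row);
                row.published = true;
                sent++;
            } catch (RuntimeException e) {
                break; // row stays pending; the next pass retries it (at-least-once)
            }
        }
        return sent;
    }
}
```

If the relay crashes after publishing but before marking the row, the event is sent twice, which is why the billing side needs an idempotent consumer.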
06

Request Flow — Appointment Journey

Appointment creation represents the primary synchronous write path, enforcing strict availability constraints via cross-service gRPC validation before persistence.

01
Inbound Payload — API Gateway ingress

Gateway validates the JWT and routes to appointment-service. The AppointmentController maps the request to a domain entity.

02
Authoritative Validation — gRPC sequence

The service invokes PatientQueryBlockingStub (Port 9090) to verify clinical existence and DoctorQueryBlockingStub to verify provider availability in real-time.

03
Resilience Layer — Circuit Breaker

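All gRPC calls are wrapped in Resilience4j circuit breaker instances. Calls to patient-service that take longer than 200ms count against the breaker, and once enough calls in the sliding window fail or run slow, the circuit opens to protect the calling thread pool.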

04
Atomic Persistence — Transactional commit

The appointment state and a corresponding AppointmentScheduled event are committed atomically to the local PostgreSQL schema using JDBC transactions.

07

Request Flow — Automated Invoicing

The billing service is purely event-driven. It consumes clinical and financial topics as a downstream consumer and ensures financial integrity through idempotent event processing and late-bound PII enrichment.

01
Ingress — Idempotency check

KafkaListener validates the eventId against the processed_events log to prevent duplicate billing on at-least-once Kafka delivery.

02
Hydration — gRPC fetch

Since the event is PII-sanitized, the service invokes the PatientQueryBlockingStub to retrieve the patient's billing address and full name.

03
Document Generation — PDF orchestration

The InvoiceService generates the ledger entry and invokes an external REST adapter to produce the standardized clinical invoice PDF.

04
Persistence — Ledger commit

The financial record is finalized in the billing_schema and the PDF metadata is linked for audit retrieval.
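
The idempotency check in step 01 can be sketched as follows. This is an illustration only: the class IdempotentBillingConsumer is hypothetical, a HashSet stands in for the processed_events log, and a list stands in for the ledger. In a real service the dedup insert and the ledger write would presumably share one database transaction so that a crash cannot record one without the other.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical consumer: the set stands in for the processed_events log.
final class IdempotentBillingConsumer {
    private final Set<String> processedEvents = new HashSet<>();
    private final List<String> ledger = new ArrayList<>();

    // Returns true if a ledger entry was written, false if the event was a duplicate.
    synchronized boolean onEvent(String eventId, String patientId) {
        if (!processedEvents.add(eventId)) {
            return false; // already seen: a redelivery must not bill twice
        }
        ledger.add("invoice-for-" + patientId);
        return true;
    }

    synchronized int ledgerSize() {
        return ledger.size();
    }
}
```

Redelivering the same eventId leaves the ledger unchanged, which is what makes at-least-once delivery safe for billing.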

06

Request Flow — API Gateway Routing

The gateway is built on Spring Cloud Gateway (WebFlux / Project Reactor). All route definitions live in application.yml. There is no service discovery — routes are statically configured to Docker Compose service names.

Inbound Path | Upstream | Filters
/api/patients/** | patient-service:8080 | StripPrefix=1
/api/doctors/** | doctor-service:8083 | StripPrefix=1
/api/appointments/** | appointment-service:8084 | StripPrefix=1
/api/auth/** | auth-service:8089 | StripPrefix=1
/api/support/** | support-service:8085 | StripPrefix=1
/api/admission/** | admission-service:8086 | StripPrefix=1
/api/notification/** | notification-service:8090 | StripPrefix=1
/api-docs/** | Swagger UI aggregation |

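
To make the effect of StripPrefix=1 concrete, the sketch below mimics static prefix routing in plain Java. It is not Spring Cloud Gateway code: the class StaticRouter and its methods are hypothetical, and only two routes from the table are registered in the usage shown. It matches a path against configured prefixes, drops the first path segment, and forwards the remainder to the upstream host.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical router that imitates static routes with a StripPrefix filter.
final class StaticRouter {
    private final Map<String, String> routes = new LinkedHashMap<>();

    // Registers a path prefix (without the trailing /**) and its upstream host:port.
    StaticRouter route(String prefix, String upstream) {
        routes.put(prefix, upstream);
        return this;
    }

    // Returns the upstream URL for the first matching prefix, with one segment stripped.
    Optional<String> resolve(String path) {
        for (Map.Entry<String, String> e : routes.entrySet()) {
            String prefix = e.getKey();
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return Optional.of("http://" + e.getValue() + stripPrefix(path, 1));
            }
        }
        return Optional.empty();
    }

    // Removes the given number of leading path segments, e.g. /api/patients/42 -> /patients/42.
    static String stripPrefix(String path, int parts) {
        String rest = path;
        for (int i = 0; i < parts; i++) {
            int next = rest.indexOf('/', 1);
            rest = next < 0 ? "/" : rest.substring(next);
        }
        return rest;
    }
}
```

For example, a router configured with route("/api/patients", "patient-service:8080") resolves /api/patients/42 to http://patient-service:8080/patients/42, so the upstream controller sees the path without the /api segment.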
WebFlux Security Note

The gateway uses @EnableWebFluxSecurity, not standard MVC security. ReactiveUserDetailsService is overridden with a no-op bean to suppress Spring Security's autoconfigured form login. Auth is handled entirely by the custom jwtAuthenticationFilter WebFilter.

07

Fault Tolerance — Resilience4j

Circuit breakers guard all outbound gRPC calls from appointment-service to patient-service and doctor-service. Configuration is identical for both:

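# appointment-service / application.properties

# failure-rate-threshold is the percentage of failed calls that opens the circuit
resilience4j.circuitbreaker.instances.patientService.sliding-window-size=10
resilience4j.circuitbreaker.instances.patientService.failure-rate-threshold=50
resilience4j.circuitbreaker.instances.patientService.wait-duration-in-open-state=10s
resilience4j.circuitbreaker.instances.patientService.permitted-number-of-calls-in-half-open-state=3
resilience4j.circuitbreaker.instances.patientService.minimum-number-of-calls=5

resilience4j.retry.instances.patientService.max-attempts=3
resilience4j.retry.instances.patientService.wait-duration=500ms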

State transitions: CLOSED → (failure rate exceeds threshold) → OPEN → (wait 10s) → HALF_OPEN → (3 probe calls pass) → CLOSED. In OPEN state, all calls immediately hit the fallback method which returns false, causing the appointment creation to fail with CustomNotFoundException rather than timing out.
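
The state transitions above can be traced with a small hand-rolled state machine. This is a simplified stand-in written for illustration, not the Resilience4j API: the class MiniCircuitBreaker is hypothetical, it hard-codes the values from the configuration above, and unlike Resilience4j it reopens on the first failed probe instead of evaluating the failure rate of the half-open calls.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified count-based breaker for illustration; not the Resilience4j implementation.
final class MiniCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private static final int WINDOW_SIZE = 10;
    private static final int MIN_CALLS = 5;
    private static final int PROBES = 3;
    private static final double FAILURE_RATE_THRESHOLD = 50.0;
    private static final long WAIT_MILLIS = 10_000L;

    private final LongSupplier clockMillis;
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int probeSuccesses;

    MiniCircuitBreaker(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    synchronized State state() {
        return state;
    }

    // Runs the action unless the circuit is open; failures and open state yield the fallback.
    synchronized <T> T call(Supplier<T> action, T fallback) {
        if (state == State.OPEN) {
            if (clockMillis.getAsLong() - openedAt < WAIT_MILLIS) {
                return fallback;          // fail fast without invoking the action
            }
            state = State.HALF_OPEN;      // wait elapsed: allow probe calls
            probeSuccesses = 0;
        }
        try {
            T result = action.get();
            if (state == State.HALF_OPEN) {
                if (++probeSuccesses >= PROBES) {
                    state = State.CLOSED;
                    window.clear();
                }
            } else {
                record(false);
            }
            return result;
        } catch (RuntimeException e) {
            if (state == State.HALF_OPEN) {
                open();                   // simplification: any failed probe reopens
            } else {
                record(true);
            }
            return fallback;
        }
    }

    private void record(boolean failed) {
        window.addLast(failed);
        if (window.size() > WINDOW_SIZE) window.removeFirst();
        long failures = window.stream().filter(f -> f).count();
        if (window.size() >= MIN_CALLS
                && failures * 100.0 / window.size() >= FAILURE_RATE_THRESHOLD) {
            open();
        }
    }

    private void open() {
        state = State.OPEN;
        openedAt = clockMillis.getAsLong();
        window.clear();
    }
}
```

Passing false as the fallback mirrors the behavior described above: while the circuit is open, the validation result is false immediately, with no remote call attempted.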

patient-service also configures a circuit breaker around its Kafka producer to prevent Kafka broker unavailability from blocking HTTP request threads.

08

Observability

Every service exposes a /actuator/prometheus endpoint via Micrometer. Prometheus scrapes all 6 services on a 5-second interval. Grafana reads from Prometheus and serves a pre-provisioned dashboard.

Metrics tracked per service

Metric | Source | Use
http_server_requests_seconds | Micrometer | Request rate, P95 latency, error rate by status code
jvm_memory_used_bytes | JVM | Heap / non-heap usage per service
hikaricp_connections_* | HikariCP | Active, idle, pending, max pool connections
resilience4j_circuitbreaker_* | Resilience4j | State (CLOSED/OPEN/HALF_OPEN), failure rate, call rate
jvm_gc_pause_seconds | JVM / G1GC | GC pause rate and duration
tomcat_threads_* | Tomcat | Current and busy thread counts

Load testing

Three k6 scripts cover different load profiles: low-stress.js (10 VUs, 30s), medium-stress.js (ramp to 50 VUs), intense-stress.js (ramp to 200 VUs). appointments-stress.js runs a full setup phase — registers a user, creates 50 patients and 50 doctors, then hammers appointment creation at 100 req/s constant arrival rate for 30 seconds. Thresholds: P95 < 2000ms, error rate < 1%.

Test Coverage

patient-service achieves 72% instruction coverage measured by JaCoCo. Coverage spans controller (@WebMvcTest), service (Mockito unit tests), repository (@DataJpaTest with H2), and Kafka producer. DTOs, entity classes, config, exception handlers, and generated Protobuf classes are excluded from the report.

09

CI / CD

Each service has its own GitHub Actions workflow triggered on push to master when files under its directory change. This prevents unrelated service rebuilds.

# Per-service workflow pattern
on:
  push:
    branches: [master]
    paths:
      - 'appointment-service/**'

# Kubernetes workflow uses matrix strategy across 6 services
strategy:
  matrix:
    service:
      - { name: api-gateway,          port: 4004 }
      - { name: appointment-service,   port: 8084 }
      - ...

The kubernetes-register.yml workflow builds multi-platform images (linux/amd64,linux/arm64) using Docker Buildx, pushes to GHCR, runs Trivy vulnerability scanning, and updates image tags in Kubernetes manifests. Dockerfile uses a two-stage build: Maven builder image + slim eclipse-temurin:21-jdk runner. JVM flags: -XX:MaxRAMPercentage=75.0 -XX:+UseG1GC -XX:+ExitOnOutOfMemoryError.

10

Observability & Resilience

The system is designed for high observability and fault isolation. Each microservice follows a standard set of cross-cutting paradigms to ensure operational reliability.

Concern | Implementation Strategy | Technical Detail
Distributed Tracing | Micrometer Tracing + OTel | 100% sampling rate; traceId and spanId injected into MDC for log correlation across service boundaries.
Aggregated Logging | Kafka Appender + Logstash | Services emit JSON logs to his-audit-logs topic. Logstash processes and persists to MongoDB for audit archiving.
Fault Tolerance | Resilience4j Circuit Breaker | Synchronous gRPC/REST calls use sliding-window failure thresholds (50%) to prevent cascading failures.
Message Reliability | Dead Letter Queues (DLQ) | Kafka consumers implement FixedBackOff (3 retries). Failed records are routed to {topic}.DLQ for manual intervention.
Health Monitoring | Liveness/Readiness Probes | Exposed via /actuator/health. Gateway performs periodic health checks before routing traffic to upstreams.

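The Message Reliability row can be illustrated with a plain-Java sketch of retry followed by dead-letter routing. It does not use Spring Kafka: the class RetryThenDeadLetter and its parameters are hypothetical, the back-off pause is omitted, and it assumes that "3 retries" means three attempts after the initial one, for four attempts in total.

```java
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical helper that imitates fixed retries followed by dead-letter routing.
final class RetryThenDeadLetter {
    private RetryThenDeadLetter() {}

    /**
     * Runs the handler once, then up to maxRetries more times on failure.
     * Returns true if an attempt succeeded; otherwise sends the record to "<topic>.DLQ".
     */
    static boolean consume(String topic, String record, Consumer<String> handler,
                           int maxRetries, BiConsumer<String, String> deadLetterSink) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                handler.accept(record);
                return true;
            } catch (RuntimeException e) {
                // A fixed back-off pause between attempts would go here; omitted in this sketch.
            }
        }
        deadLetterSink.accept(topic + ".DLQ", record);
        return false;
    }
}
```

A record that keeps failing therefore stops blocking the partition after the last retry and waits in the dead-letter topic for manual intervention.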
11

API Reference & Aggregation

The system provides a unified developer portal. The API Gateway aggregates Swagger/OpenAPI documentation from all downstream services, exposing them through a single ingress point.

Unified Doc Endpoints

Documentation is accessible via /aggregate/{service-name}/v3/api-docs. The Gateway portal (Port 4004) serves as the authority for the full clinical API surface.