Sizing & Scaling

Boundary’s architecture (FC/IS + hexagonal ports) is meant to let you scale vertically, horizontally, or both, mostly by configuration rather than rewrites. This page is an honest map of how far that holds today: which knobs exist, which components are already safe to run as multiple replicas, and which still hold in-process state that you must account for.

Three axes

Axis	Meaning
Vertical	One process, more resources. Bigger heap, larger connection/thread pools, more CPU. Pure configuration in Boundary.
Horizontal	Many processes (replicas) behind a load balancer, sharing backing services (Postgres, Redis). Requires that no request-handling state lives in a single process.
Functional decomposition	Slice modules into separate deployables (microservices-style) that scale independently. A cross-module call that was in-process becomes a network call. Highest leverage, highest effort.

Vertical scaling — by configuration today

All sizing knobs live in resources/conf/{dev,test,prod,acc}/config.edn plus environment variables. Change the value, restart the process. No code.

Knob	Where	Notes
DB connection pool	`:boundary/postgresql :pool` — `minimum-idle`, `maximum-pool-size`, `connection-timeout-ms`, `idle-timeout-ms`, `max-lifetime-ms`, `keepalive-time-ms`, `leak-detection-threshold-ms`	HikariCP. prod default `min 10 / max 50`, dev `2 / 10`, test (H2) `1 / 5`. `DB_POOL_SIZE` env override.
HTTP server	`:boundary/http` — `:port`, `:host`, `:port-range`	Jetty. `HTTP_PORT` / `HTTP_HOST` env. Jetty manages its own thread pool.
JVM heap / GC	`JAVA_OPTS` (see `docker-compose.yml`)	Default `-Xmx512m -Xms128m -XX:+UseG1GC -XX:+UseContainerSupport`. Raise `-Xmx` for vertical scale.
Cache (Redis) pool	`:boundary/cache` — `:max-total`, `:max-idle`, `:min-idle`, `:timeout`, `:default-ttl`	Jedis pool. prod default `max-total 50 / max-idle 20 / min-idle 5`.

Watch the multiplication: with N replicas each holding a pool of maximum-pool-size, total Postgres connections = N × max. Keep N × max under the server’s max_connections.
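
The arithmetic can be checked with a small helper. This is a sketch only: the function names, the `max_connections` value of 200, and the 10 connections reserved for migrations and admin sessions are illustrative assumptions, not Boundary defaults.

```clojure
(defn total-connections
  "Worst-case Postgres connections held by `replicas` processes,
   each with a pool capped at `pool-size`."
  [replicas pool-size]
  (* replicas pool-size))

(defn max-replicas
  "Largest replica count whose combined pools still fit under
   `max-connections`, after setting aside `reserved` connections."
  [max-connections reserved pool-size]
  (quot (- max-connections reserved) pool-size))

;; Assumed server: max_connections = 200, 10 reserved, prod pool max = 50.
(max-replicas 200 10 50)   ;=> 3
(total-connections 3 50)   ;=> 150
```

If the replica count has to grow past that number, lower `maximum-pool-size` per replica or put a connection pooler in front of Postgres.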

How the architecture enables horizontal scaling

Cross-module calls go through protocols defined in each module’s ports.clj (enforced by bb check:ports). Core logic depends on a protocol, never on a concrete adapter. That seam is the scaling lever: swap an in-process adapter for a distributed one — Redis, a queue, a remote service — without touching the functional core.

Two libraries already ship both adapters and pick between them in config:

```clojure
;; libs/cache  — :in-memory (dev/test) | :redis (multi-instance)
;; libs/jobs   — :in-memory (dev/test) | :redis (shared queue across workers)
:boundary/cache {:provider :redis ...}
```

This is the template every other seam follows: the protocol is the contract, the distributed adapter is "just configuration" once it exists.
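
A minimal sketch of that seam, assuming a made-up `KeyValueCache` protocol and `make-cache` constructor (neither is Boundary's actual API). The caller, `remember-greeting!`, depends only on the protocol, so which adapter it receives is decided entirely by the `:provider` value in config.

```clojure
;; Hypothetical port: illustrative names, not Boundary's real cache protocol.
(defprotocol KeyValueCache
  (get-value [this k])
  (put-value! [this k v]))

;; In-process adapter: state lives in one JVM, so it suits dev/test only.
(defrecord InMemoryCache [state]
  KeyValueCache
  (get-value [_ k] (get @state k))
  (put-value! [_ k v] (swap! state assoc k v) v))

;; The adapter is chosen from configuration, never by the caller.
(defmulti make-cache :provider)

(defmethod make-cache :in-memory [_]
  (->InMemoryCache (atom {})))

(defmethod make-cache :redis [_]
  ;; A distributed adapter would implement the same protocol here.
  (throw (ex-info "Redis adapter is not part of this sketch" {})))

;; Core-side code: knows the protocol, not the adapter.
(defn remember-greeting! [cache user-name]
  (put-value! cache [:greeting user-name] (str "Hello, " user-name)))

(def demo-result
  (let [cache (make-cache {:provider :in-memory})]
    (remember-greeting! cache "ada")
    (get-value cache [:greeting "ada"])))
```

Switching the config map to `{:provider :redis}` would change the adapter without touching `remember-greeting!`.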

Horizontal readiness matrix

Component	N replicas	Detail
Cache	✅	Redis adapter (`libs/cache/…/adapters/redis.clj`) — Nippy serialization, atomic ops. In-memory adapter is dev/test only.
Jobs	✅	Redis queue (`libs/jobs/…/adapters/redis.clj`) — priority lists, worker heartbeat, retry/backoff, dead-letter. Scheduled-job promotion is an atomic claim (Redis `ZREM` / in-memory `swap-vals!`), so a due job is moved to an execution queue exactly once across concurrent workers. The handler registry is per-process, but a dequeued job with no local handler is re-enqueued for another worker and dead-lettered only after it has gone unhandled for `:max-requeue-age-ms` (default 5 min) — no silent DLQ drop. The re-enqueue is delayed (parked in the scheduled set `:requeue-delay-ms` ahead) so a handlerless worker can’t reacquire it in a tight loop. Give-up is age-based, not attempt-based, so wrong-worker misses under load (or more handlerless workers than any attempt budget) can’t drop a job a slow handler-owning worker simply hadn’t polled yet. A worker started with an empty registry warns loudly at startup (BOU-88).
Auth / sessions	✅	DB-backed, pure core (`libs/user/…/core/session.clj`). No server-side sticky state — any replica can serve any request.
Multi-tenancy	✅	schema-per-tenant (`libs/tenant/…/provisioning.clj`). Instance-agnostic; routing is per-request.
Email / external	✅	Async via the jobs queue / stateless IO adapters.
Readiness checks	✅	`/health/ready` (`readiness-handler`, wired in `wiring.clj`) probes DB + cache and returns 503 when any is down — correct for load balancers and k8s.
Rate limiting	✅	The config-driven `http-rate-limit-protection` interceptor is wired into the default route pipeline (`wiring.clj` injects `:rate-limit` config + the `:boundary/cache` into the interceptor `system` map). Configure under `:boundary/http :rate-limit {:enabled? :limit :window-ms}` (per-env; `HTTP_RATE_LIMIT`/`HTTP_RATE_LIMIT_WINDOW_MS`). Enforcement is opt-in (default off — the bundled prod/acc configs ship it disabled; enable it together with an active Redis cache) so an upgrade cannot start 429-ing existing consumers, nor silently run a per-process limiter in production. With an active Redis cache the fixed-window limit is shared across replicas; with no cache it falls back to a per-process counter — single-node only, effective global limit = limit × N (the wiring logs a warning at startup in that case). The in-process fallback is heap-bounded by a hard cap — before a new client is recorded at the cap, stale clients are swept and, if the map is still full, the least-recently-active client is evicted — so high-cardinality client ids can’t leak memory on a long-running node even when every client is in-window. BOU-87.
Graceful shutdown	✅	Integrant `halt!` runs on a JVM shutdown hook (`src/boundary/main.clj`), closing pools and stopping Jetty. Jetty is configured for graceful connection draining (`configure-graceful-shutdown!` in `wiring.clj`): on stop it stops accepting new connections, rejects new requests with 503, and lets in-flight requests finish within `:boundary/http :drain-timeout-ms` (default 30000 ms in prod/acc, env `HTTP_DRAIN_TIMEOUT_MS`; `0` disables). Set the window above the load balancer’s deregistration delay for zero-downtime rollouts (BOU-86).
Realtime / WebSocket	✅ *	Replica-safe via `:provider :redis` (BOU-85, ADR-035). Live sockets remain node-local; routing envelopes fan out over a Redis pub/sub channel so a broadcast reaches clients on any replica. Topic subscriptions are stored in Redis sets and are cluster-wide. *The default `:in-memory` provider is single-node only — use `:redis` for multi-replica deployments.
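
The "atomic claim" in the Jobs row can be illustrated with `swap-vals!`, which returns both the old and the new value of an atom from a single successful compare-and-set. The sketch below is not the code in `libs/jobs`; `claim!` and `race-claims` are hypothetical names. It shows why only one of several racing workers wins. The Redis analogue is that `ZREM` reports how many members it actually removed, so only one caller sees a non-zero count.

```clojure
(defn claim!
  "Atomically removes job-id from the scheduled set held in scheduled-atom.
   Returns true only for the caller whose swap actually removed it."
  [scheduled-atom job-id]
  (let [[before after] (swap-vals! scheduled-atom disj job-id)]
    (and (contains? before job-id)
         (not (contains? after job-id)))))

(defn race-claims
  "Starts n threads that all try to claim the same job; returns each result."
  [scheduled-atom job-id n]
  (let [results (atom [])
        threads (mapv (fn [_]
                        (Thread. (fn []
                                   (swap! results conj
                                          (claim! scheduled-atom job-id)))))
                      (range n))]
    (run! #(.start ^Thread %) threads)
    (run! #(.join ^Thread %) threads)
    @results))

(def scheduled (atom #{"job-1" "job-2"}))
(def outcomes (race-claims scheduled "job-1" 8))

;; Exactly one worker wins the claim; the other job is untouched.
(count (filter true? outcomes))   ;=> 1
@scheduled                        ;=> #{"job-2"}
```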

Topologies

Shape	When
Single fat node	Vertical only. One process, large heap and pools. Everything works, including realtime and in-memory rate limiting. Simplest; capped by one machine.
N stateless web replicas	The main horizontal mode. N copies of the uberjar behind a load balancer, sharing Postgres + Redis. Cache, jobs, auth, tenancy all scale. Caveats: wire Redis rate limiting; WebSocket scales horizontally via `:provider :redis` (sticky sessions are only required when running the `:in-memory` provider).
Web / worker split (future)	Run dedicated job-worker processes separate from web. The jobs Redis queue already supports it, but there is currently no `worker` launch mode — `boundary.main` exposes only `server` and `cli`. Adding a worker entrypoint is the enabling step.

Production checklist

Use the Redis cache and jobs adapters, never :in-memory, for more than one replica.
Register all job handlers on every instance. A dequeued job with no local handler is re-enqueued so another instance can run it, and is dead-lettered only after it has gone unhandled for :max-requeue-age-ms. A job type that no worker registers still cycles through the queue for that whole window before failing, so keep handler sets consistent.
Enable rate limiting (:boundary/http :rate-limit :enabled? true) with an active Redis cache for a global limit across replicas; without a cache each instance counts independently (limit × N).
Keep replicas × maximum-pool-size under Postgres max_connections.
Confirm your load balancer points health probes at /health/ready (503-aware), not /health/live.
For WebSocket: use :provider :redis on :boundary/realtime to scale across replicas (ADR-035). Sticky sessions / single-node are only required with the default :in-memory provider.
Add Redis (and, for testing N replicas, a load balancer) to your deployment — docker-compose.yml currently defines a single app service only.
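
Pulling the checklist together, a multi-replica config fragment might look like the sketch below. It uses only keys named on this page. The `:limit` and `:window-ms` values are illustrative assumptions, and Redis connection settings, the jobs provider key, and every other required key are left out rather than guessed.

```clojure
;; Fragment of a prod config.edn (sketch, not a complete config).
{:boundary/cache    {:provider :redis}            ; shared cache, also backs the global rate limit
 :boundary/realtime {:provider :redis}            ; WebSocket fan-out across replicas
 :boundary/http     {:rate-limit {:enabled?  true
                                  :limit     100     ; assumed value
                                  :window-ms 60000}  ; assumed value
                     :drain-timeout-ms 30000}}    ; keep above the LB deregistration delay
```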

Functional decomposition (slicing services out)

The third axis: run a module (or a few) as its own process, scaled and deployed independently of the rest. This is where the ports.clj seam pays off most — and where the most net-new infrastructure is needed. It is not free "by config" today, but the architecture is positioned for it.

What already enables it

Asset	How it helps
Per-module activation	Modules are gated by `:enabled?` / `:active` in `config.edn`. A process can boot a subset — the http-handler concats only present routes (`(or routes [])` in `wiring.clj`), so a "user-only" process is a config, not a fork.
The protocol seam	Consumers depend on the protocol (e.g. `IUserService`), never the concrete record. Swapping an in-process record for a remote HTTP client implementing the same protocol leaves the caller untouched.
Wire format ready	Muuntaja (JSON / EDN / Transit) is already in the HTTP stack (`reitit_router.clj`); Malli `schema.clj` per module gives ready contracts.
Remote-adapter template	`libs/external` (SMTP, Twilio) is a gold-standard outbound adapter: record + `extend-protocol` + `clj-http` + error envelope + logging. Copy it for a service client.
Clean data boundaries	`bb check:ports` already forbids one module’s shell from touching another’s `shell.persistence`/`shell.service` — the only cross-module path is the service port. No cross-module SQL joins to untangle.
Context plumbing	`correlation-id`, tenant, and auth already flow through the interceptor pipeline and can ride request headers across a network hop.

What must be built

Generic remote-port adapter — a clj-http client that implements a module’s protocol, serializes via Malli, propagates correlation-id / tenant / auth, unwraps errors. None exists yet; all cross-module calls are in-process.
Network resilience — timeouts, retries, circuit breaker, service discovery (hardcoded URLs for MVP). External adapters use :throw-exceptions false but no retry/breaker.
Break the allowlisted dependency cycles — check_deps.clj allows admin↔user, platform↔{user,tenant,admin,workflow,search}. A cycle means two modules can’t be cleanly separated; these must be broken (e.g. extract the shared auth check behind a port) before slicing.
Async option — IEventBus is defined in libs/user/ports.clj but has no implementation. Event-driven decoupling needs a real adapter (Redis Streams / Kafka / RabbitMQ).
Data ownership decision — schema-per-tenant assumes co-located modules in one Postgres. Across services either share the DB (pragmatic) or give each service its own; there are no distributed transactions, so split writes become eventual-consistency.
Service launch mode — boundary.main exposes only server and cli; slicing needs an entrypoint that boots a named module subset as a service.
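
To make the first item concrete, here is a rough sketch of what a remote-port adapter could look like. Everything in it is hypothetical: the `UserLookup` protocol stands in for a real module port, the header names are invented, and `http-fn` is an injected function from a request map to a response map, standing in for an HTTP client so the example runs without extra dependencies. A real adapter would add Malli (de)serialization, timeouts, and retries.

```clojure
;; Hypothetical port; the real IUserService methods are not shown on this page.
(defprotocol UserLookup
  (find-user [this user-id]))

;; Today's shape: the adapter lives in the same process.
(defrecord InProcessUserLookup [users]
  UserLookup
  (find-user [_ user-id] (get users user-id)))

;; Sliced-out shape: same protocol, but the call crosses the network.
(defrecord RemoteUserLookup [base-url http-fn context]
  UserLookup
  (find-user [_ user-id]
    (let [response (http-fn {:method  :get
                             :url     (str base-url "/users/" user-id)
                             ;; Assumed header names for context propagation.
                             :headers {"x-correlation-id" (:correlation-id context)
                                       "x-tenant-id"      (:tenant-id context)}})]
      (if (= 200 (:status response))
        (:body response)
        (throw (ex-info "Remote user lookup failed"
                        {:status (:status response) :user-id user-id}))))))

;; The caller depends on the protocol only.
(defn display-name [user-port user-id]
  (:name (find-user user-port user-id)))

;; Stub transport used in place of a real HTTP client.
(defn fake-http [request]
  (if (= "http://users.internal/users/42" (:url request))
    {:status 200 :body {:id 42 :name "Ada"}}
    {:status 404 :body nil}))

(def local  (->InProcessUserLookup {42 {:id 42 :name "Ada"}}))
(def remote (->RemoteUserLookup "http://users.internal" fake-http
                                {:correlation-id "req-1" :tenant-id "acme"}))

(display-name local 42)    ;=> "Ada"
(display-name remote 42)   ;=> "Ada"
```

`display-name` is identical for both adapters, which is the property the protocol seam is meant to preserve when a module moves out of process.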

Sliceability by module

Module	Effort	Why
payments	Easy	Zero internal Boundary deps (only Maven). Already a self-contained provider. The natural pilot for the remote-adapter pattern.
core, observability	Easy	Leaf / infra; no sibling deps. (Usually shared libs, not standalone services.)
user, tenant, external	With work	Depend on platform + the in-process service assumption in middleware. Need the remote adapter + cycle-breaking (user↔admin, tenant↔platform).
admin, search, workflow	Entangled	`search`/`workflow` depend on admin’s schema provider; `admin↔user` is circular. Extract shared schema/auth behind ports first.

Recommended path: build the generic remote-port adapter once, prove it by extracting payments as a standalone service, then tackle user. Don’t attempt admin/search/workflow until the cycles are broken.

Known gaps & roadmap

The architecture delivers the promise; these are the concrete pieces that make "scale by configuration" fully true. Tracked under the BOU-84 spike:

✅ Realtime Redis pub/sub adapter — shipped in BOU-85 (ADR-035). WebSocket is now replica-safe via :provider :redis on :boundary/realtime.
✅ Graceful connection draining — shipped in BOU-86. Configurable shutdown grace (:boundary/http :drain-timeout-ms) lets rollouts finish in-flight requests; Jetty GracefulHandler + setStopTimeout wired in wiring.clj.
✅ Default rate-limit wiring — shipped in BOU-87. Config-driven http-rate-limit-protection is in the default pipeline; enable via :boundary/http :rate-limit and it uses the Redis cache for a cross-replica limit (per-process fallback documented).
✅ Jobs hardening — shipped in BOU-88. Missing-handler jobs are re-enqueued (bounded) instead of silently dead-lettered, an empty-registry worker warns at startup, and scheduled-job promotion is an atomic claim (ZREM / swap-vals!) so a due job runs exactly once across workers.
Deploy topology reference — compose + k8s example with N replicas, Redis, a load balancer, instance-id, and a web/worker split.

For functional decomposition (the bigger bet):

Generic remote-port adapter + RPC envelope — clj-http client implementing a module protocol, Malli (de)serialization, context propagation, error unwrap, retry/circuit-breaker. Pilot by extracting payments.
Service launch mode — boundary.main entrypoint that boots a named module subset as an independent service.
Break allowlisted dependency cycles — admin↔user, platform↔{user,tenant,admin,workflow,search} — prerequisite for slicing those modules.
IEventBus implementation — Redis Streams / Kafka adapter for async, event-driven inter-service decoupling (port already defined, unimplemented).
