Version: 1.0.0
Last Updated: 2026-01-03
Audience: DevOps, SRE, Operations Engineers
Boundary applications follow the Functional Core / Imperative Shell (FC/IS) pattern:
┌─────────────────────────────────────────┐
│ HTTP Layer (Reitit) │
│ - Routes, Handlers, Middleware │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Interceptors & Security │
│ - Auth, Rate Limiting, Observability │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Service Layer (Shell) │
│ - Transaction Management, I/O │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Business Logic (Core) │
│ - Pure Functions, No Side Effects │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Adapters (Database, External) │
│ - PostgreSQL, SQLite, MySQL, H2 │
└─────────────────────────────────────────┘
| Component | Purpose | Critical? |
|---|---|---|
| HTTP Server (Jetty) | Handles incoming requests | Yes |
| Database (PostgreSQL) | Data persistence | Yes |
| Logging (Datadog/Console) | Observability | Yes |
| Metrics (DogStatsD) | Performance monitoring | No |
| Error Reporting (Sentry) | Exception tracking | No |
Production Minimum:
Database:
Required:
# Application
ENV=production # Environment: dev, test, prod
PORT=3000 # HTTP server port
# Database
DATABASE_URL=postgresql://user:pass@host:5432/dbname
DATABASE_POOL_SIZE=10 # Connection pool size
DATABASE_POOL_TIMEOUT=30000 # Connection timeout (ms)
# Security
JWT_SECRET=<64-char-random-hex> # CRITICAL: Use secrets manager
SESSION_SECRET=<64-char-random-hex>
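A startup wrapper can fail fast when one of the required variables above is missing or malformed. The following is a minimal sketch, not part of Boundary itself: the function names `missing_env` and `is_hex64` are hypothetical, and the 64-character hex check simply mirrors the placeholder format shown for `JWT_SECRET` and `SESSION_SECRET`.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not a Boundary API.

# Print the name of every listed variable that is unset or empty.
missing_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Succeed only if the argument is exactly 64 hexadecimal characters.
is_hex64() {
  printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{64}$'
}
```

For example, `missing_env ENV PORT DATABASE_URL JWT_SECRET SESSION_SECRET` prints one line per missing variable, and `is_hex64 "$JWT_SECRET"` returns a non-zero status when the secret does not have the expected shape.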
Optional (Observability):
# Datadog
DATADOG_API_KEY=<your-api-key>
DATADOG_SERVICE=boundary-api
DATADOG_ENVIRONMENT=production
DATADOG_STATSD_HOST=localhost
DATADOG_STATSD_PORT=8125
# Sentry
SENTRY_DSN=https://...@sentry.io/...
CRITICAL: Never commit secrets to version control!
Recommended Solutions:
Example: AWS Secrets Manager
# Store secret
aws secretsmanager create-secret \
--name boundary/production/jwt-secret \
--secret-string "$(openssl rand -hex 32)"
# Retrieve in application startup
export JWT_SECRET=$(aws secretsmanager get-secret-value \
--secret-id boundary/production/jwt-secret \
--query SecretString --output text)
# 1. Build JAR
clojure -T:build jar
# 2. Copy to server
scp target/boundary-standalone.jar prod-server:/opt/boundary/
# 3. Start service
java -Xmx1g -Xms512m \
-Denv=production \
-jar /opt/boundary/boundary-standalone.jar
Systemd Service (/etc/systemd/system/boundary.service):
[Unit]
Description=Boundary API Service
After=network.target postgresql.service
[Service]
Type=simple
User=boundary
WorkingDirectory=/opt/boundary
EnvironmentFile=/opt/boundary/config/production.env
ExecStart=/usr/bin/java -Xmx1g -Xms512m \
-Denv=production \
-jar /opt/boundary/boundary-standalone.jar
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Manage Service:
sudo systemctl start boundary
sudo systemctl enable boundary
sudo systemctl status boundary
sudo journalctl -u boundary -f # View logs
Dockerfile:
FROM clojure:temurin-17-tools-deps AS builder
WORKDIR /app
COPY deps.edn .
RUN clojure -P
COPY . .
RUN clojure -T:build jar
FROM eclipse-temurin:17-jre-alpine
RUN addgroup -S boundary && adduser -S boundary -G boundary
WORKDIR /app
COPY --from=builder /app/target/boundary-standalone.jar .
USER boundary
EXPOSE 3000
ENV JAVA_OPTS="-Xmx1g -Xms512m"
CMD ["sh", "-c", "java $JAVA_OPTS -jar boundary-standalone.jar"]
Build and Run:
# Build image
docker build -t boundary-api:latest .
# Run container
docker run -d \
--name boundary-api \
-p 3000:3000 \
-e ENV=production \
-e DATABASE_URL=postgresql://... \
-e JWT_SECRET=$JWT_SECRET \
--restart unless-stopped \
boundary-api:latest
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: boundary-api
namespace: production
spec:
replicas: 3
selector:
matchLabels:
app: boundary-api
template:
metadata:
labels:
app: boundary-api
spec:
containers:
- name: boundary
image: your-registry/boundary-api:v1.0.0
ports:
- containerPort: 3000
env:
- name: ENV
value: "production"
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: boundary-secrets
key: database-url
- name: JWT_SECRET
valueFrom:
secretKeyRef:
name: boundary-secrets
key: jwt-secret
resources:
requests:
memory: "512Mi"
cpu: "250m"
limits:
memory: "2Gi"
cpu: "1000m"
livenessProbe:
httpGet:
path: /health/live
port: 3000
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /health/ready
port: 3000
initialDelaySeconds: 10
periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
name: boundary-api
namespace: production
spec:
selector:
app: boundary-api
ports:
- port: 80
targetPort: 3000
type: ClusterIP
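The Deployment above reads `DATABASE_URL` and `JWT_SECRET` from a Secret named `boundary-secrets` with the keys `database-url` and `jwt-secret`. A minimal sketch of creating that Secret with `kubectl` follows. The wrapper name `create_boundary_secrets` is hypothetical, and passing secrets as arguments can leave them in shell history, so treat this as an illustration rather than a recommended production procedure.

```shell
# create_boundary_secrets DATABASE_URL JWT_SECRET
# Creates the Secret that deployment.yaml references via secretKeyRef.
# Assumes kubectl is already configured for the target cluster.
create_boundary_secrets() {
  kubectl create secret generic boundary-secrets \
    --namespace production \
    --from-literal=database-url="$1" \
    --from-literal=jwt-secret="$2"
}
```

Usage: `create_boundary_secrets "$DATABASE_URL" "$JWT_SECRET"`, run once before applying the Deployment.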
Available Endpoints:
GET /health - Overall system health
{
"status": "healthy",
"service": "boundary-api",
"version": "1.0.0",
"timestamp": "2026-01-03T12:00:00Z"
}
GET /health/live - Liveness probe (Kubernetes)
{"status": "alive"}
GET /health/ready - Readiness probe (Kubernetes)
{"status": "ready"}
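Outside Kubernetes, a deploy script can poll the readiness endpoint before routing traffic to a new instance. The sketch below assumes that `/health/ready` answers with a non-2xx status until the service is ready, which the text above does not state explicitly. The function name `wait_for_ready` and its default attempt count and delay are hypothetical.

```shell
# wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
# Returns 0 as soon as URL answers with a 2xx status, 1 if attempts run out.
wait_for_ready() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while :; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx responses.
    if curl -fsS -o /dev/null "$url"; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}
```

Usage: `wait_for_ready http://localhost:3000/health/ready 30 2 || echo "service did not become ready"`.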
Log Levels:
TRACE - Detailed debugging information
DEBUG - General debugging information
INFO - Informational messages (default)
WARN - Warning messages (potential issues)
ERROR - Error messages (failures)
FATAL - Critical failures
Log Formats:
Console (Development):
2026-01-03 12:00:00 INFO [boundary.user.service] User created user-id=123 email=user@example.com
JSON (Production with Datadog):
{
"timestamp": "2026-01-03T12:00:00Z",
"level": "info",
"message": "User created",
"service": "boundary-api",
"user-id": "123",
"email": "user@example.com",
"correlation-id": "req-abc123"
}
Key Log Fields:
correlation-id - Trace requests across services
user-id - User performing action
operation - Business operation type
duration-ms - Operation duration
error.kind / error.message - Exception details
System Metrics:
# HTTP metrics
http.requests.count{path,method,status}
http.request.duration{path,method}
# Database metrics
database.query.count{operation}
database.query.duration{operation}
database.connection.pool.active
database.connection.pool.idle
# Application metrics
application.users.created.count
application.sessions.active
application.errors.count{type}
Setting Up Datadog Agent:
# Install agent
DD_API_KEY=<your-key> DD_SITE="datadoghq.com" bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script.sh)"
# Configure DogStatsD
sudo vim /etc/datadog-agent/datadog.yaml
# Set:
# dogstatsd_port: 8125
# dogstatsd_non_local_traffic: true
# Restart agent
sudo systemctl restart datadog-agent
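After the agent restarts, a hand-built test metric is a quick way to confirm that DogStatsD is listening. DogStatsD accepts plain-text datagrams of the form `name:value|type|#tags`. The sketch below only formats such a line; the helper name `statsd_line` is hypothetical, and how you send the datagram (for example with a UDP-capable `nc`) depends on the tools installed on the host.

```shell
# statsd_line NAME VALUE TYPE [TAGS]
# Prints one DogStatsD datagram, e.g. "application.errors.count:1|c|#type:timeout".
# TYPE is a DogStatsD type code such as c (counter), g (gauge), or ms (timer).
statsd_line() {
  line="$1:$2|$3"
  if [ -n "${4:-}" ]; then
    line="$line|#$4"
  fi
  printf '%s\n' "$line"
}
```

Usage: pipe the output of `statsd_line application.errors.count 1 c type:test` to a UDP client pointed at `DATADOG_STATSD_HOST` and `DATADOG_STATSD_PORT`, then look for the metric in Datadog.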
Configuration:
export SENTRY_DSN=https://...@sentry.io/...
export SENTRY_ENVIRONMENT=production
export SENTRY_RELEASE=v1.0.0
What Gets Reported:
Critical Alerts (Page On-Call):
Warning Alerts (Slack/Email):
Location: resources/config/
Files:
dev.edn - Development environment
test.edn - Test environment
prod.edn - Production environment (use environment variables for secrets!)
Example Production Config:
{:boundary/settings
{:name "Boundary API"
:version "1.0.0"
:environment :production}
:boundary/http-server
{:port #env PORT
:host "0.0.0.0"
:join? true}
:boundary/database
{:connection-uri #env DATABASE_URL
:pool-size #long #env [DATABASE_POOL_SIZE 10]
:connection-timeout #long #env [DATABASE_POOL_TIMEOUT 30000]}
:boundary/logging
{:provider :datadog
:api-key #env DATADOG_API_KEY
:service #env DATADOG_SERVICE
:level :info}
:boundary/metrics
{:provider :datadog-statsd
:host #env [DATADOG_STATSD_HOST "localhost"]
:port #long #env [DATADOG_STATSD_PORT 8125]
:service #env DATADOG_SERVICE}
:boundary/error-reporting
{:provider :sentry
:dsn #env SENTRY_DSN
:environment #env SENTRY_ENVIRONMENT
:release #env [SENTRY_RELEASE "unknown"]}}
Priority Order (highest to lowest):
prod.edn)
Check Migration Status:
clojure -M:migrate status
Run Pending Migrations:
# Dry run (safe)
clojure -M:migrate migrate --dry-run
# Execute migrations
clojure -M:migrate migrate
Rollback Last Migration:
clojure -M:migrate rollback
Create New Migration:
clojure -M:migrate create add-email-verification
# Edit generated files:
# migrations/YYYYMMDD-HHMMSS-add-email-verification.up.sql
# migrations/YYYYMMDD-HHMMSS-add-email-verification.down.sql
Migration Best Practices:
.down.sql)
Formula for pool size:
pool_size = ((core_count * 2) + effective_spindle_count)
Example: 4-core server with 1 SSD:
pool_size = (4 * 2) + 1 = 9 ≈ 10 connections
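The formula is easy to script when sizing pools across differently shaped hosts. This is a minimal sketch; the function name `pool_size` is hypothetical, and reading the core count from `nproc` assumes a Linux host with GNU coreutils.

```shell
# pool_size CORES SPINDLES
# Prints (CORES * 2) + SPINDLES, the starting point given by the formula above.
pool_size() {
  echo $(( ($1 * 2) + $2 ))
}
```

For the 4-core, single-SSD example, `pool_size 4 1` prints 9, which the guide rounds up to 10. On a live host, `pool_size "$(nproc)" 1` applies the same formula to the detected core count.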
Configuration:
export DATABASE_POOL_SIZE=10
export DATABASE_POOL_TIMEOUT=30000 # 30 seconds
PostgreSQL Backup (Automated):
#!/bin/bash
# /opt/boundary/scripts/backup-db.sh
# Abort on errors, unset variables, and failures inside pipelines
set -euo pipefail
BACKUP_DIR=/var/backups/boundary
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/boundary_${TIMESTAMP}.sql.gz"
# DATABASE_URL must be set in the environment (cron does not inherit your login shell's variables)
: "${DATABASE_URL:?DATABASE_URL is not set}"
mkdir -p "$BACKUP_DIR"
# Create backup
pg_dump "$DATABASE_URL" | gzip > "$BACKUP_FILE"
# Retain only last 30 days
find "$BACKUP_DIR" -name "boundary_*.sql.gz" -mtime +30 -delete
# Upload to S3 (optional)
aws s3 cp "$BACKUP_FILE" \
  s3://your-backup-bucket/boundary/
Cron Schedule:
# Daily at 2 AM
0 2 * * * /opt/boundary/scripts/backup-db.sh >> /var/log/boundary-backup.log 2>&1
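A scheduled backup is only useful if the resulting file is intact. The sketch below finds the newest backup by its timestamped file name and checks that it is a non-empty, valid gzip stream. The function names `latest_backup` and `verify_backup` are hypothetical. `gzip -t` verifies compression integrity only, so it does not replace a periodic test restore.

```shell
# latest_backup DIR
# Prints the newest boundary_*.sql.gz in DIR (nothing if none exist).
# The YYYYMMDD_HHMMSS suffix sorts chronologically, so a name sort is enough.
latest_backup() {
  ls -1 "$1"/boundary_*.sql.gz 2>/dev/null | tail -n 1
}

# verify_backup FILE
# Succeeds only if FILE is non-empty and passes gzip's integrity test.
verify_backup() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}
```

Usage: `verify_backup "$(latest_backup /var/backups/boundary)" || echo "latest backup is missing or corrupt"`.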
Restore from Backup:
# 1. Stop application
sudo systemctl stop boundary
# 2. Drop and recreate database (DESTRUCTIVE!)
dropdb boundary_production
createdb boundary_production
# 3. Restore backup
gunzip -c /var/backups/boundary/boundary_20260103_020000.sql.gz | psql boundary_production
# 4. Start application
sudo systemctl start boundary
| Severity | Definition | Response Time | Examples |
|---|---|---|---|
| SEV1 | Service down, data loss | 15 minutes | Database down, app crashed |
| SEV2 | Degraded performance | 1 hour | High latency, partial outage |
| SEV3 | Minor issues | 4 hours | Slow queries, minor bugs |
| SEV4 | Cosmetic issues | Next business day | UI glitches |
Symptoms:
Response Steps:
Commands:
# Check service status
sudo systemctl status boundary
sudo journalctl -u boundary --since "10 minutes ago"
# Check database connectivity
psql $DATABASE_URL -c "SELECT 1"
# Restart service
sudo systemctl restart boundary
# Rollback deployment (K8s)
kubectl rollout undo deployment/boundary-api -n production
# Scale up (K8s)
kubectl scale deployment/boundary-api --replicas=5 -n production
Symptoms:
Response Steps:
Commands:
# Tail error logs
sudo journalctl -u boundary -p err -f
# Check resource usage
top
htop
docker stats # If using Docker
# Check database connections
psql $DATABASE_URL -c "SELECT count(*) FROM pg_stat_activity"
# Scale up (K8s)
kubectl scale deployment/boundary-api --replicas=10 -n production
First 5 Minutes:
Next 15 Minutes:
Post-Incident:
Daily:
Weekly:
Monthly:
Quarterly:
Logrotate Configuration (/etc/logrotate.d/boundary):
/var/log/boundary/*.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
create 0640 boundary boundary
sharedscripts
postrotate
systemctl reload boundary > /dev/null 2>&1 || true
endscript
}
Check for Outdated Dependencies:
clojure -M:outdated
Update Dependencies:
# 1. Review changes in deps.edn
# 2. Test in dev environment
# 3. Run full test suite
clojure -M:test
# 4. Deploy to staging
# 5. Run integration tests
# 6. Deploy to production (during low-traffic window)
Symptoms:
OutOfMemoryError exceptions
Diagnosis:
# Check JVM heap usage
jstat -gc <pid> 1000
# Generate heap dump
jmap -dump:live,format=b,file=/tmp/heap.bin <pid>
# Analyze heap dump with VisualVM or Eclipse MAT
Fixes:
-Xmx2g
-XX:+UseG1GC -XX:MaxGCPauseMillis=200
Symptoms:
Diagnosis:
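-- Find slow queries (PostgreSQL; requires the pg_stat_statements extension)
-- On PostgreSQL 13 and later, use total_exec_time in place of total_time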
SELECT
query,
calls,
total_time / calls as avg_time_ms,
total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 20;
-- Find index candidates: high-cardinality columns with low physical correlation
SELECT
schemaname,
tablename,
attname,
n_distinct,
correlation
FROM pg_stats
WHERE schemaname = 'public'
AND n_distinct > 100
AND correlation < 0.1;
Fixes:
Symptoms:
Connection timeout errors
Diagnosis:
-- Check active connections
SELECT count(*), state
FROM pg_stat_activity
GROUP BY state;
-- Find long-running queries
SELECT pid, now() - query_start as duration, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
Fixes:
DATABASE_POOL_SIZE=20
SELECT pg_terminate_backend(<pid>);
Symptoms:
Certificate expired errors
Check Certificate Expiry:
echo | openssl s_client -connect yourdomain.com:443 2>/dev/null | \
openssl x509 -noout -dates
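For alerting, it helps to turn the expiry date into a number of days remaining. The sketch below assumes GNU `date` (the `-d` option is not available in BSD or BusyBox `date`), and the function name `days_until` is hypothetical.

```shell
# days_until NOT_AFTER [NOW_EPOCH]
# NOT_AFTER is the date part of openssl's "notAfter=" line,
# for example "Apr  3 12:00:00 2026 GMT". Requires GNU date.
# NOW_EPOCH defaults to the current time and exists mainly for testing.
days_until() {
  expiry=$(date -u -d "$1" +%s) || return 1
  now=${2:-$(date -u +%s)}
  echo $(( (expiry - now) / 86400 ))
}

# Example wiring (not executed here):
#   not_after=$(echo | openssl s_client -connect yourdomain.com:443 2>/dev/null |
#     openssl x509 -noout -enddate | cut -d= -f2)
#   days_until "$not_after"
```

A monitoring job could warn when the result drops below a threshold of your choosing, such as 14 days.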
Renew Let's Encrypt Certificate:
sudo certbot renew
sudo systemctl reload nginx # or your reverse proxy
Configured by Default:
Content-Security-Policy - XSS protection
X-Frame-Options: DENY - Clickjacking protection
X-Content-Type-Options: nosniff - MIME sniffing protection
Strict-Transport-Security - Force HTTPS
X-XSS-Protection: 1; mode=block - Legacy XSS protection
Verify Headers:
curl -I https://your-api.com/health
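Reading the `curl -I` output by eye is error-prone, so the check can be scripted. The sketch below reads response headers on standard input and prints each of the five headers listed above that is absent. The function name `missing_security_headers` is hypothetical. Names are matched case-insensitively because HTTP/2 responses use lowercase header names.

```shell
# missing_security_headers
# Reads HTTP response headers on stdin; prints every expected header that is absent.
missing_security_headers() {
  headers=$(cat)
  for name in Content-Security-Policy X-Frame-Options X-Content-Type-Options \
              Strict-Transport-Security X-XSS-Protection; do
    if ! printf '%s\n' "$headers" | grep -qi "^$name:"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Usage: `curl -sI https://your-api.com/health | missing_security_headers`. Empty output means all five headers are present.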
Default Limits:
Monitor Rate Limit Blocks:
# Check logs for rate limit violations
sudo journalctl -u boundary | grep "rate_limit_exceeded"
JWT Secret Rotation (Zero-Downtime):
Session Token Invalidation:
# Invalidate sessions older than 1 hour (affected users must log in again)
psql $DATABASE_URL -c "DELETE FROM user_sessions WHERE created_at < NOW() - INTERVAL '1 hour'"
Check Suspicious Activity:
-- Failed login attempts
SELECT email, COUNT(*) as attempts, MAX(created_at) as last_attempt
FROM audit_events
WHERE event_type = 'login_failed'
AND created_at > NOW() - INTERVAL '1 hour'
GROUP BY email
HAVING COUNT(*) > 10;
-- Admin actions
SELECT actor, resource, action, result, created_at
FROM audit_events
WHERE actor_role = 'admin'
AND created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;
Recommended JVM Flags:
# -Xmx2g                            Max heap size
# -Xms512m                          Initial heap size
# -XX:+UseG1GC                      G1 garbage collector
# -XX:MaxGCPauseMillis=200          Target GC pause time
# -XX:+HeapDumpOnOutOfMemoryError   Dump heap on OOM
# -XX:HeapDumpPath=/tmp/heap.bin    Heap dump location
java \
  -Xmx2g \
  -Xms512m \
  -XX:+UseG1GC \
  -XX:MaxGCPauseMillis=200 \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/tmp/heap.bin \
  -Denv=production \
  -jar boundary-standalone.jar
PostgreSQL Settings (postgresql.conf):
# Connections
max_connections = 200
# Memory
shared_buffers = 1GB # 25% of system RAM
effective_cache_size = 3GB # 75% of system RAM
work_mem = 16MB # Per operation memory
maintenance_work_mem = 256MB
# Query Planner
random_page_cost = 1.1 # For SSDs (default: 4.0)
effective_io_concurrency = 200 # For SSDs (default: 1)
# Logging
log_min_duration_statement = 1000 # Log queries >1 second
log_line_prefix = '%t [%p]: '
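The memory values above correspond to a host with 4 GB of RAM. To apply the same 25% and 75% ratios to other hosts, a small helper can derive both settings. This is a sketch; the function name `pg_memory_settings` is hypothetical, and the ratios are the rules of thumb from the comments above rather than tuned values.

```shell
# pg_memory_settings RAM_MB
# Prints shared_buffers (25% of RAM) and effective_cache_size (75% of RAM) in MB.
pg_memory_settings() {
  ram_mb=$1
  echo "shared_buffers = $(( ram_mb / 4 ))MB"
  echo "effective_cache_size = $(( ram_mb * 3 / 4 ))MB"
}
```

For example, `pg_memory_settings 4096` prints `shared_buffers = 1024MB` and `effective_cache_size = 3072MB`, matching the 1GB and 3GB values shown.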
HTTP Caching:
;; Add cache headers to static resources
{:status 200
:headers {"Cache-Control" "public, max-age=31536000, immutable"}
:body static-resource}
Application-Level Caching (Future):
Run Load Test with Apache Bench:
# 1000 requests, 10 concurrent
ab -n 1000 -c 10 http://localhost:3000/health
# With authentication
ab -n 1000 -c 10 -H "Authorization: Bearer <token>" \
http://localhost:3000/api/v1/users
Run Load Test with k6:
// load-test.js
import http from 'k6/http';
import { check } from 'k6';
export let options = {
vus: 50, // 50 virtual users
duration: '5m', // 5 minutes
};
export default function() {
let res = http.get('http://localhost:3000/health');
check(res, {
'status is 200': (r) => r.status === 200,
'response time < 200ms': (r) => r.timings.duration < 200,
});
}
k6 run load-test.js
Performance Targets:
# Application
sudo systemctl start boundary
sudo systemctl stop boundary
sudo systemctl restart boundary
sudo systemctl status boundary
sudo journalctl -u boundary -f
# Database
psql $DATABASE_URL
clojure -M:migrate status
clojure -M:migrate migrate
# Monitoring
curl http://localhost:3000/health
curl http://localhost:3000/health/ready
curl http://localhost:3000/health/live
# Docker
docker logs -f boundary-api
docker exec -it boundary-api /bin/sh
docker stats boundary-api
# Kubernetes
kubectl get pods -n production
kubectl logs -f deployment/boundary-api -n production
kubectl describe pod <pod-name> -n production
kubectl exec -it <pod-name> -n production -- /bin/sh
Emergency Contacts:
Support Channels:
Document Version: 1.0.0
Last Review Date: 2026-01-03
Next Review Date: 2026-04-03 (Quarterly)