A high-performance eBPF-based Layer 4 load balancer written in Clojure. Uses XDP (eXpress Data Path) for ingress DNAT and TC (Traffic Control) for egress SNAT, providing kernel-level packet processing for efficient traffic distribution. Can also be used as a simple reverse proxy.
Requirements:
- bpftool (optional, for debugging)
- iproute2 (for TC qdisc management)

Install Java 25:
# Ubuntu 24.04+
sudo apt-get install openjdk-25-jdk
# Or download from https://jdk.java.net/25/
Install Clojure CLI:
curl -L -O https://github.com/clojure/brew-install/releases/latest/download/linux-install.sh
chmod +x linux-install.sh
sudo ./linux-install.sh
Clone the repository:
git clone https://github.com/pgdad/clj-ebpf-lb.git
cd clj-ebpf-lb
Create a configuration file (lb.edn):
{:proxies
 [{:name "web"
   :listen {:interfaces ["eth0"] :port 80}
   :default-target {:ip "10.0.0.1" :port 8080}}]
 :settings
 {:stats-enabled false
  :connection-timeout-sec 300}}
Run the load balancer:
sudo clojure -M:run -c lb.edn
Usage: clj-ebpf-lb [options]
Options:
-c, --config FILE Configuration file path (default: config.edn)
-i, --interface IFACE Network interface to attach to (can specify multiple)
-p, --port PORT Listen port (default: 80)
-t, --target TARGET Default target as ip:port (default: 127.0.0.1:8080)
-s, --stats Enable statistics collection
-v, --verbose Verbose output
-h, --help Show help
Examples:
clj-ebpf-lb -c lb.edn
clj-ebpf-lb -i eth0 -p 80 -t 10.0.0.1:8080
clj-ebpf-lb -i eth0 -i eth1 -p 443 -t 10.0.0.2:8443 --stats
The proxy is configured using an EDN (Extensible Data Notation) file:
{:proxies
 [;; Each proxy configuration
  {:name "proxy-name"              ; Unique identifier
   :listen
   {:interfaces ["eth0" "eth1"]    ; Network interfaces to listen on
    :port 80}                      ; Port to listen on
   :default-target
   {:ip "10.0.0.1"                 ; Default backend IP
    :port 8080}                    ; Default backend port
   :source-routes                  ; Optional: source-based routing rules
   [{:source "192.168.1.0/24"      ; Source IP or CIDR
     :target {:ip "10.0.0.2" :port 8080}}
    {:source "10.10.0.0/16"
     :target {:ip "10.0.0.3" :port 8080}}]}]
 :settings
 {:stats-enabled false             ; Enable real-time statistics
  :connection-timeout-sec 300      ; Connection idle timeout
  :max-connections 100000}}        ; Maximum tracked connections
Route traffic to different backends based on the client's source IP:
:source-routes
[;; Route internal network to internal backend
 {:source "192.168.0.0/16"
  :target {:ip "10.0.0.10" :port 8080}}
 ;; Route specific host to dedicated backend
 {:source "192.168.1.100"
  :target {:ip "10.0.0.20" :port 8080}}
 ;; Route cloud VPC to cloud backend
 {:source "10.0.0.0/8"
  :target {:ip "10.0.0.30" :port 8080}}]
The routing precedence is: the most specific matching source route wins (a /32 host entry beats a /16 prefix, which beats a /8), and traffic that matches no source route falls through to the default target.
Route TLS traffic to different backends based on the Server Name Indication (SNI) hostname in the TLS ClientHello. This enables multi-tenant HTTPS load balancing without terminating TLS (layer 4 passthrough with layer 7 inspection).
{:proxies
 [{:name "https-gateway"
   :listen {:interfaces ["eth0"] :port 443}
   :default-target {:ip "10.0.0.1" :port 8443}
   :sni-routes
   [{:sni-hostname "api.example.com"
     :target {:ip "10.0.1.1" :port 8443}}
    {:sni-hostname "web.example.com"
     :target {:ip "10.0.2.1" :port 8443}}
    {:sni-hostname "app.example.com"
     :targets [{:ip "10.0.3.1" :port 8443 :weight 70}
               {:ip "10.0.3.2" :port 8443 :weight 30}]}]}]}
How it works: the TLS ClientHello of each new connection is inspected for its SNI hostname, and the connection is routed to the matching target without terminating TLS.

SNI Routing Features:
- Case-insensitive hostname matching (API.Example.COM matches api.example.com)
- Single-target and weighted multi-target routes

Use Cases:
- Multi-tenant HTTPS load balancing without TLS termination

Limitations:
- No wildcard hostnames (e.g. *.example.com)

Distribute traffic across multiple backend servers with configurable weights. This is useful for canary deployments, A/B testing, capacity-based distribution, and blue/green deployments.
;; Weighted default target - distribute across 3 backends
:default-target
[{:ip "10.0.0.1" :port 8080 :weight 50}   ; 50% of traffic
 {:ip "10.0.0.2" :port 8080 :weight 30}   ; 30% of traffic
 {:ip "10.0.0.3" :port 8080 :weight 20}]  ; 20% of traffic

;; Weighted source routes
:source-routes
[{:source "192.168.0.0/16"
  :targets [{:ip "10.0.1.1" :port 8080 :weight 70}
            {:ip "10.0.1.2" :port 8080 :weight 30}]}]
Weight Rules: weights are relative integers; each target receives traffic in proportion to its share of the total weight, and a target with weight 0 receives no new connections.
Canary Deployment Example:
;; 95% stable, 5% canary
:default-target
[{:ip "10.0.0.1" :port 8080 :weight 95}  ; Stable version
 {:ip "10.0.0.2" :port 8080 :weight 5}]  ; Canary version
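To sanity-check a split before rolling it out, the weighted choice can be simulated in plain Clojure (illustrative only; the real target selection happens inside the BPF program):

(defn pick-weighted [targets]
  ;; Pick a target at random, in proportion to its :weight
  (let [total (reduce + (map :weight targets))
        r     (rand-int total)]
    (loop [[t & more] targets
           acc        0]
      (if (< r (+ acc (:weight t)))
        t
        (recur more (+ acc (:weight t)))))))

(frequencies
 (map :ip (repeatedly 10000
                      #(pick-weighted [{:ip "10.0.0.1" :weight 95}
                                       {:ip "10.0.0.2" :weight 5}]))))
;; => roughly {"10.0.0.1" 9500, "10.0.0.2" 500}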
Blue/Green Deployment Example:
;; Gradual traffic shift from blue to green
:default-target
[{:ip "10.0.0.1" :port 8080 :weight 20}  ; Blue (old)
 {:ip "10.0.0.2" :port 8080 :weight 80}] ; Green (new)
Automatically detect unhealthy backends and redistribute traffic to healthy ones. Health checking uses virtual threads for efficient concurrent monitoring.
Enable health checking in settings:
:settings
{:health-check-enabled true
 :health-check-defaults
 {:type :tcp
  :interval-ms 10000        ; Check every 10 seconds
  :timeout-ms 3000          ; 3 second timeout
  :healthy-threshold 2      ; 2 successes = healthy
  :unhealthy-threshold 3}}  ; 3 failures = unhealthy
Per-target health check configuration:
:default-target
[{:ip "10.0.0.1" :port 8080 :weight 50
  :health-check {:type :tcp
                 :interval-ms 5000
                 :timeout-ms 2000}}
 {:ip "10.0.0.2" :port 8080 :weight 50
  :health-check {:type :http
                 :path "/health"
                 :interval-ms 5000
                 :expected-codes [200 204]}}]
Health Check Types:
| Type | Description | Use Case |
|---|---|---|
| `:tcp` | TCP connection test | Fast, low overhead |
| `:http` | HTTP GET with status validation | Application-level health |
| `:https` | HTTPS GET with status validation | Secure endpoints |
| `:none` | Skip health checking | Always considered healthy |
Weight Redistribution:
When a backend becomes unhealthy, its traffic is redistributed proportionally to remaining healthy backends:
Original weights: [50, 30, 20]
If middle server fails: [71, 0, 29] (proportional redistribution)
If all servers fail: [50, 30, 20] (graceful degradation - keep original)
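The redistribution arithmetic can be sketched in a few lines (redistribute is a hypothetical helper for illustration, not part of the library API):

(defn redistribute
  "Scale each healthy weight so the healthy set carries the full total;
  if every backend is unhealthy, keep the original weights."
  [weights healthy?]
  (let [total         (reduce + weights)
        healthy-total (reduce + (map #(if %2 %1 0) weights healthy?))]
    (if (zero? healthy-total)
      weights
      (mapv (fn [w ok?]
              (if ok?
                (Math/round (double (* w (/ total healthy-total))))
                0))
            weights healthy?))))

(redistribute [50 30 20] [true false true])
;; => [71 0 29]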
Gradual Recovery:
When a backend recovers, traffic is gradually restored rather than returned at full weight, to avoid overwhelming it.
Health Check Parameters:
| Parameter | Default | Description |
|---|---|---|
| `type` | `:tcp` | Check type (`:tcp`, `:http`, `:https`, `:none`) |
| `path` | `"/health"` | HTTP(S) endpoint path |
| `interval-ms` | 10000 | Time between checks (1000-300000) |
| `timeout-ms` | 3000 | Check timeout (100-60000) |
| `healthy-threshold` | 2 | Consecutive successes to mark healthy |
| `unhealthy-threshold` | 3 | Consecutive failures to mark unhealthy |
| `expected-codes` | [200 201 202 204] | Valid HTTP response codes |
Gracefully remove backends from the load balancer by stopping new connections while allowing existing ones to complete. This is essential for zero-downtime deployments, maintenance windows, and rolling updates.
Basic draining:
;; Start draining - no new connections, existing ones continue
(lb/drain-backend! "web" "10.0.0.1:8080")

;; Check drain status
(lb/get-drain-status "10.0.0.1:8080")
;; => {:target-id "10.0.0.1:8080"
;;     :status :draining
;;     :elapsed-ms 5000
;;     :current-connections 3
;;     :initial-connections 10}

;; Cancel drain and restore traffic
(lb/undrain-backend! "web" "10.0.0.1:8080")
Draining with timeout and callback:
;; Drain with 60 second timeout and completion callback
(lb/drain-backend! "web" "10.0.0.1:8080"
                   :timeout-ms 60000
                   :on-complete (fn [status]
                                  (case status
                                    :completed (println "Drain complete, safe to remove")
                                    :timeout   (println "Drain timed out, forcing removal")
                                    :cancelled (println "Drain was cancelled"))))
Synchronous draining (blocks until complete):
;; Block until drain completes or times out
(let [status (lb/wait-for-drain! "10.0.0.1:8080")]
  (when (= status :completed)
    (lb/remove-backend! "web" "10.0.0.1:8080")))
Rolling update example:
(defn rolling-update [proxy-name targets]
  (doseq [target targets]
    ;; Drain the old instance
    (lb/drain-backend! proxy-name target :timeout-ms 30000)
    (lb/wait-for-drain! target)
    ;; Deploy and add new instance
    (deploy-new-version! target)
    (lb/undrain-backend! proxy-name target)))
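Several backends can be drained at once by wrapping the same calls in futures (a sketch; drain-all! is not part of the library API):

(defn drain-all! [proxy-name targets]
  (let [futs (mapv (fn [target]
                     (future
                       (lb/drain-backend! proxy-name target :timeout-ms 30000)
                       (lb/wait-for-drain! target)))
                   targets)]
    ;; Returns a vector of drain results (:completed, :timeout, or :cancelled)
    (mapv deref futs)))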
Drain Status Values:
| Status | Description |
|---|---|
| `:draining` | Drain in progress, waiting for connections to close |
| `:completed` | All connections closed, drain finished successfully |
| `:timeout` | Timeout expired, some connections may still exist |
| `:cancelled` | Drain cancelled via `undrain-backend!` |
Configuration:
:settings
{:default-drain-timeout-ms 30000  ; Default timeout (30 seconds)
 :drain-check-interval-ms 1000}   ; How often to check connection counts
How it works:
- drain-backend! sets the target's weight to 0 in BPF maps, so no new connections are routed to it while existing connections continue.

Use the load balancer as a library in your Clojure application:
(require '[lb.core :as lb]
         '[lb.config :as config])

;; Create configuration
(def cfg (config/make-simple-config
          {:interface "eth0"
           :port 80
           :target-ip "10.0.0.1"
           :target-port 8080
           :stats-enabled true}))

;; Initialize the load balancer
(lb/init! cfg)

;; Check status
(lb/get-status)
;; => {:running true, :attached-interfaces ["eth0"], ...}

;; Add a source route at runtime (single target)
(lb/add-source-route! "web" "192.168.1.0/24"
                      {:ip "10.0.0.2" :port 8080})

;; Add a weighted source route at runtime
(lb/add-source-route! "web" "10.10.0.0/16"
                      [{:ip "10.0.0.3" :port 8080 :weight 70}
                       {:ip "10.0.0.4" :port 8080 :weight 30}])
;; Get active connections
(lb/get-connections)
;; Print connection statistics
(lb/print-connections)
;; Shutdown
(lb/shutdown!)
;; Add a new proxy at runtime
(lb/add-proxy!
 {:name "api"
  :listen {:interfaces ["eth0"] :port 8080}
  :default-target {:ip "10.0.1.1" :port 3000}})
;; Remove a proxy
(lb/remove-proxy! "api")
;; Add/remove source routes
(lb/add-source-route! "web" "10.20.0.0/16" {:ip "10.0.0.5" :port 8080})
(lb/remove-source-route! "web" "10.20.0.0/16")
;; Add/remove SNI routes (for TLS traffic)
(lb/add-sni-route! "https-gateway" "api.example.com"
                   {:ip "10.0.1.1" :port 8443})

;; Add weighted SNI route
(lb/add-sni-route! "https-gateway" "web.example.com"
                   [{:ip "10.0.2.1" :port 8443 :weight 70}
                    {:ip "10.0.2.2" :port 8443 :weight 30}])
;; Remove SNI route (case-insensitive)
(lb/remove-sni-route! "https-gateway" "api.example.com")
;; List SNI routes
(lb/list-sni-routes "https-gateway")
;; => [{:hostname "web.example.com"
;;      :targets [{:ip "10.0.2.1" :port 8443 :weight 70}
;;                {:ip "10.0.2.2" :port 8443 :weight 30}]}]
;; List all SNI routes across all proxies
(lb/list-all-sni-routes)
;; Attach/detach interfaces
(lb/attach-interfaces! ["eth1" "eth2"])
(lb/detach-interfaces! ["eth2"])
;; Enable/disable stats
(lb/enable-stats!)
(lb/disable-stats!)
;; Get connection statistics
(lb/get-connection-count)
(lb/get-connection-stats)
;; => {:total-connections 42
;;     :total-packets-forward 12345
;;     :total-bytes-forward 987654
;;     :total-packets-reverse 11234
;;     :total-bytes-reverse 876543}
;; Start streaming statistics
(require '[clojure.core.async :as async])

(lb/start-stats-stream!)
(let [ch (lb/subscribe-to-stats)]
  ;; Take one event from the channel; use <!! for a blocking take
  ;; outside a go block (<! is only valid inside go blocks)
  (async/<!! ch))
(lb/stop-stats-stream!)
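;; Sketch: consume stats events continuously on a core.async go-loop
;; (assumes the channel is closed when the stream stops)
(let [ch (lb/subscribe-to-stats)]
  (async/go-loop []
    (when-let [event (async/<! ch)]
      (println "stats event:" event)
      (recur))))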
(require '[lb.health :as health])
;; Start/stop health checking system
(health/start!)
(health/stop!)
(health/running?) ; => true/false
;; Get health status
(health/get-status "web")
;; => {:proxy-name "web"
;;     :targets [{:target-id "10.0.0.1:8080"
;;                :status :healthy
;;                :last-latency-ms 2.5
;;                :consecutive-successes 5}
;;               {:target-id "10.0.0.2:8080"
;;                :status :unhealthy
;;                :last-error :connection-refused}]
;;     :original-weights [50 50]
;;     :effective-weights [100 0]}
(health/get-all-status) ; All proxies
(health/healthy? "web" "10.0.0.1:8080")
(health/all-healthy? "web")
(health/unhealthy-targets "web")
;; Subscribe to health events
(def unsubscribe
  (health/subscribe!
   (fn [event]
     (println "Health event:" (:type event) (:target-id event)))))
;; Events: :target-healthy, :target-unhealthy, :weights-updated
;; Unsubscribe
(unsubscribe)
;; Manual control (for maintenance)
(health/set-target-status! "web" "10.0.0.1:8080" :unhealthy)
(health/force-check! "web" "10.0.0.1:8080")
;; Direct health checks (for testing)
(health/check-tcp "10.0.0.1" 8080 2000)
;; => {:success? true :latency-ms 1.5}
(health/check-http "10.0.0.1" 8080 "/health" 3000 [200])
;; => {:success? true :latency-ms 15.2 :message "HTTP 200"}
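;; Sketch: gate a deployment step on a direct check (illustrative use
;; of the check functions above)
(let [{:keys [success? latency-ms]} (health/check-tcp "10.0.0.1" 8080 2000)]
  (if success?
    (println "Backend reachable in" latency-ms "ms")
    (throw (ex-info "Backend not reachable" {:target "10.0.0.1:8080"}))))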
;; Format status for display
(health/print-status "web")
(health/print-all-status)
;; Start draining a backend (stops new connections)
(lb/drain-backend! "web" "10.0.0.1:8080")
(lb/drain-backend! "web" "10.0.0.1:8080"
                   :timeout-ms 60000
                   :on-complete (fn [status] (println "Drain finished:" status)))
;; Cancel drain and restore traffic
(lb/undrain-backend! "web" "10.0.0.1:8080")
;; Check if target is draining
(lb/draining? "10.0.0.1:8080") ; => true/false
;; Get drain status for a target
(lb/get-drain-status "10.0.0.1:8080")
;; => {:target-id "10.0.0.1:8080"
;;     :proxy-name "web"
;;     :status :draining
;;     :elapsed-ms 5000
;;     :timeout-ms 30000
;;     :current-connections 3
;;     :initial-connections 10}
;; Get all currently draining backends
(lb/get-all-draining)
;; => [{:target-id "10.0.0.1:8080" :status :draining ...}
;;     {:target-id "10.0.0.2:8080" :status :draining ...}]
;; Block until drain completes (returns :completed, :timeout, or :cancelled)
(lb/wait-for-drain! "10.0.0.1:8080")
;; Print drain status
(lb/print-drain-status)
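;; Sketch: block until no backends remain draining, polling once per second
(loop []
  (when (seq (lb/get-all-draining))
    (Thread/sleep 1000)
    (recur)))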
                +------------------+
                |    User Space    |
                |  (Clojure App)   |
                +--------+---------+
                         |
                BPF Maps (shared)
    +--------------------+---------------------+
    |                    |                     |
    v                    v                     v
+--------+         +------------+         +----------+
| Listen |         | Conntrack  |         | Settings |
|  Map   |         |    Map     |         |   Map    |
+--------+         +------------+         +----------+
    |                    |                     |
    +--------------------+---------------------+
                         |
    +--------------------+---------------------+
    |                    |                     |
    v                    v                     v
+--------+         +------------+         +----------+
|  XDP   | Ingress |   Kernel   |  Egress |    TC    |
| (DNAT) +-------->+   Stack    +-------->+  (SNAT)  |
+--------+         +------------+         +----------+
    ^                                          |
    |                                          v
+---+------------------------------------------+-----+
|                 Network Interface                  |
+----------------------------------------------------+
Traffic that does not match a configured listener is passed through with XDP_PASS to continue normal kernel processing.

# Run all tests (requires root)
sudo clojure -M:test
The project includes QEMU-based ARM64 testing infrastructure:
# One-time setup (downloads Ubuntu 24.04 ARM64 image)
./qemu-arm64/setup-vm.sh
# Start the ARM64 VM
./qemu-arm64/start-vm.sh --daemon
# Run tests in ARM VM
./qemu-arm64/run-tests-in-vm.sh --sync
# Stop the VM
./qemu-arm64/stop-vm.sh
See qemu-arm64/README.md for detailed ARM64 testing documentation.
# Build the uberjar
clojure -X:uberjar
# Run the uberjar
sudo java -jar target/clj-ebpf-lb.jar -c lb.edn
"Cannot attach XDP program"
- Run as root (sudo)
- Verify the interface exists: ip link show
- Check the kernel version: uname -r (need 5.15+)

"BPF program verification failed"
- Check kernel logs: dmesg | tail -50
- Verify BPF support: zgrep CONFIG_BPF /proc/config.gz

"Connection not being NAT'd"
- Verify interfaces are attached: (lb/list-attached-interfaces)
- Inspect tracked connections: (lb/get-connections)

# View attached XDP programs
sudo bpftool prog list
# View BPF maps
sudo bpftool map list
# Monitor traffic
sudo tcpdump -i eth0 -n port 80
Tune max-connections based on expected concurrent connections.

MIT License. See LICENSE for details.
Note: This project was previously named clj-ebpf-reverse-proxy. The core functionality remains the same, but the name was changed to better reflect the weighted load balancing capabilities.