A ring middleware for applying rate limiting policies to HTTP requests.
The middleware is used to implement request rate limits on HTTP endpoints. A key feature is the ability to stack rate limits: multiple instances of the middleware can be wrapped around the same route, e.g. before and after authentication.
The library provides only IP address -based limiting out of the box,
i.e. it's up to the library user to implement other types of rate limits
by implementing the RateLimit protocol. The obvious rate limit to
implement is a user-specific limit.
A storage implementation is used for storing rate limit counters. The
library provides storage implementations for an in-process atom and
Redis, but new storage implementations can be provided easily by
implementing the Storage protocol.
[listora/ring-congestion "0.1.2"]
The middleware is used by wrapping a ring request handler with either
wrap-rate-limit or wrap-stacking-rate-limit. For both functions
the first argument is the ring request handler to wrap, and the second
argument is the configuration for the rate limiting middleware. The
configuration is used to specify the storage backend, the rate limit
being applied by this instance of the middleware, and a response
builder used when the rate limit has been exhausted.
Let's start with the simplest possible use case: limiting requests to
1 req/s per IP address, returning the default 429 - Too Many Requests response when the rate limit is exhausted.
(require '[clj-time.core :as t])
(require '[compojure.core :refer :all])
(require '[congestion.middleware :refer [wrap-rate-limit]])
(require '[congestion.storage :as storage])
;; Instantiate a storage backend
(def storage (storage/local-storage))
;; Define the rate limit: 1 req/s per IP address
(def limit (ip-rate-limit :limit-id 1 (t/seconds 1)))
;; Define the middleware configuration
(def rate-limit-config {:storage storage :limit limit})
;; Wrap the /limit route in the rate limiting middleware
(def app (routes
(GET "/no-limit" [] "no-limit")
(wrap-rate-limit
(GET "/limit" [] "limit")
rate-limit-config)))
Note that the rate limit, ip-rate-limit, takes an identifier, a
number of requests and a time-to-live as arguments. The identifier is
used when referring to the limit counter in the storage backend and
therefore should be unique for each limit. The request count and TTL
together describe how many request can be made within a certain
time-span.
The wrap-rate-limit middleware checks and updates the rate limit
counter before calling the wrapped request handler. This means that an
earlier rate limit is applied before a later rate limit is
checked. For example, an 'unauthenticated' rate limit would be applied
before a 'user-specific' rate limit. This is often not what is wanted,
i.e. an authenticated user would usually have a greater rate limit
than unauthenticated users, but with wrap-rate-limit the
'unauthenticated' limit would get exhausted and further requests would
be denied. The wrap-stacking-rate-limit is provided to address this
issue.
The reason for using wrap-rate-limit rather than the more flexible
wrap-stacking-rate-limit is that since wrap-rate-limit increments
the counter before calling the request handler, there is less of a
chance of concurrent requests being allowed to execute when the rate
limit is already exhausted.
An application's ring middleware stack can have multiple instances of
the wrap-stacking-rate-limit middleware and the rate limits will get
updated in reverse order. That is, each middleware checks if its rate
limit has been exhausted and, if so, denies the request. But if the
limit has not been exhausted the request is delegated to the wrapped
handler allowing subsequent rate limiting middlewares to be applied to
the request. When the wrapped handler returns a response, the rate
limiting middleware checks if a rate limit has been applied, and if
not so, the middleware will increment its own rate limit counter.
Performing the counter update after the request has been handled means
that a subsequent rate limiting middleware can be applied instead of
the current middleware. For example, in the below example code when
the request is authenticated we want the greater user-limit to be
applied rather than the lower unauthenticated-limit. Were the
unauthenticated-limit applied regardless of whether the request
authenticates or not would mean that an authenticated user could only
perform 100 req/h rather than the intended 5000 req/h.
(require '[clj-time.core :as t])
(require '[compojure.core :refer :all])
(require '[congestion.middleware :refer [wrap-rate-limit]])
(require '[congestion.storage :as storage])
;; A custom per-user rate limit
(defrecord UserRateLimit [id quota ttl]
RateLimit
(get-key [self req]
(str (.getName (class self)) id "-" (:user-name req)))
(get-quota [self req]
quota)
(get-ttl [self req]
ttl))
;; Instantiate a storage backend
(def storage (storage/local-storage))
;; Define a limit and config for unauthenticated requests: 100 req/h
;; per IP address
(def unauthenticated-limit (ip-rate-limit :unauthenticated-limit 100 (t/hours 1)))
(def unauthenticated-config {:storage storage :limit unauthenticated-limit})
;; Define a limit and config for authenticated requests: 5000 req/h
;; per user
(def user-limit (->UserRateLimit :user-limit 5000 (t/hours 1)))
(def user-config {:storage storage :limit user-limit})
(defn wrap-authentication
[handler]
(fn [req]
;; TODO: magically authenticate users and attach user name to request
(let [user-name "Bob"
req (assoc req :user-name user-name)]
(handler req))))
(def app (routes
(GET "/no-limit" [] "no-limit")
(->
(ANY "/limit" [] "limit")
(wrap-stacking-rate-limit user-config)
(wrap-authentication)
(wrap-stacking-rate-limit unauthenticated-config))))
Note: we implement a UserRateLimit to be able to perform rate
limiting based on the :user-name field in the request. The key
points are: 1) attaching some data, :user-name in this case, to the
request, and 2) looking that data up in the rate limit
implementation. The get-key function returns a key that is used to
identify the rate limit counter. For a user-specific rate limit that
key should be unique to each user. For an IP address -specific rate
limit the key should be unique to each IP address.
When the rate limit is exhausted, the middleware needs to produce a
ring response to this effect. The library provides a default response
builder, which returns a JSON 429 response:
{:body "{\"error\": \"Too Many Requests\"}"
:headers {"Content-Type" "application/json"
"Retry-After" "Fri, 28 Nov 2014 12:03:55 GMT"}
:status 429}
Most likely the default response is not suitable for your
application. For example, you want to specify a custom Content-Type,
or the response body isn't in the correct format.
The middleware configuration accepts a custom response builder as the
:response-builder key:
(defn custom-response-builder
[quota retry-after]
...)
(wrap-rate-limit app {... :response-builder custom-response-builder})
The response builder takes two arguments: quota and retry-after,
where quota is the number of requests allowed by the limit that has
been exhausted and retry-after is the time when the rate limit
counter will be reset.
The simplest way to build an appropriate 429 - Too Many Requests
response is to call the too-many-requests-response function with a
custom ring response map. The too-many-requests-response function
will add a Retry-After header to the response and set :status to
429 unless it was already set.
But if you want to, you can return whatever response you desire from your custom response builder.
When a rate limit is applied to a request, the library assocs the
quota state to the ring response with the key
:congestion.responses/rate-limit-applied. The quota state is either
AvailableQuota or ExhaustedQuota as defined in
congestion.quota-state.
This serves two purposes: 1) it allows stacked rate limiting middleware to work out if a rate limit has already been applied, and 2) it is helpful in tests and in debugging since we can inspect what rate limit was applied, what the total quota is and how many requests are remaining until the rate limit resets.
But the data in the response is available to any other ring middleware so you can also use it to add custom HTTP headers to responses reporting the total and available requests, if you want to!
The library provides only an IP address rate limit. Other rate limits have to be implemented by the library user. This might change in the future, but at the moment it seemed like there was little benefit in trying to guess what authentication system people use etc. Contributions are welcome!
The RateLimit protocol describes the interface for a rate limit:
(defprotocol RateLimit
(get-key [self req])
(get-quota [self req])
(get-ttl [self req]))
The IpRateLimit is an example of a simple static limit:
(defrecord IpRateLimit [id quota ttl]
RateLimit
(get-key [self req]
(str (.getName (class self)) id "-" (:remote-addr req)))
(get-quota [self req]
quota)
(get-ttl [self req]
ttl))
The key part is returning a key from get-key that is unique within
the context defined by the limit. E.g. for IpRateLimit we want each
unique request IP address to have its own rate limit counter. The
return value is used to identify the rate limit counter in the storage
backend.
For a static limit, like IpRateLimit above, the get-quota and
get-ttl functions both just return the value passed to the limit
during construction time.
A dynamic limit could return different quota and TTL value depending on the request. E.g. we could define a custom rate limit for each user of our application, where both the quota and the TTL would be looked up from the user database after the user has been authenticated.
The easiest way to do this with ring-congestion would be to attach the
rate limit information to the request, e.g. in the authentication
middleware, and then simply look them up from the request in
get-quota and get-ttl:
(defrecord UserRateLimit [id quota ttl]
RateLimit
(get-key [self req]
(str (.getName (class self)) id "-" (:user-name req)))
(get-quota [self req]
(:user-rate-limit-quota req))
(get-ttl [self req]
(:user-rate-limit-ttl req))
The library comes with two storage implementations: local-storage
and redis-storage, but it should be easy to write your own storage
implementation by implementing the Storage protocol by taking
inspiration from the provided storage implementations.
The Storage protocol looks like this:
(defprotocol Storage
(get-count [self key])
(increment-count [self key ttl])
(counter-expiry [self key])
(clear-counters [self]))
The clear-counters function is used to clear all counter state from
the storage which allows the application operator to reset all rate
limits. This is mainly a convenience for the operator in case
something goes wrong with the rate limiting implementation and HTTP
API users are unable to make requests.
The other three function, get-count, increment-count and
counter-expiry, are used to observe and increment the
counters. We'll describe the contracts of all three functions here to
make Storage implementation easier.
get-count is provided with a counter key, generated by calling
get-key on the RateLimit instance, and it is expected to return
the current value of the counter, or 0 if the counter does not
exist.
increment-count is provided again with a counter key, and a
time-to-live, which is a clj-time/JodaTime duration
(e.g. (clj-time.core/hours 1)). The ttl argument is used to
schedule the deletion of the counter after the counter expires, so
it's only really significant if the counter does not exist already.
The LocalStorage storage implementation keeps track of when counters
should expire and purges expired counters when get-count is
called. The RedisStorage storage implementation instead uses a
feature of Redis, the EXPIRE command, to specify when Redis should
delete the counter automatically.
counter-expiry is called in order to get the time-stamp that is used
to generate the Retry-After header for the 429 - Too Many Requests
response.
Note: the RedisStorage storage implementation prefixes all limit
keys with the string congestion-. The idea is to underline that
those Redis keys belong to ring-congestion in cases where the same
Redis instance is used to store other application data as well. Having
a common prefix makes the implementation of the clear-counters
function simple as well.
The Storage protocol provides a clear-counters function for
clearing all rate limit counters from storage. This can be used both
in tests to clear state before and after tests, and by operators to
clear all limits if something goes wrong with rate limiting.
Note: since LocalStorage by default creates a new atom to store
state, it is not possible to clear state by simply creating a new
LocalStorage instance and calling clear-counters on it. Instead
the application must create the atom and hold on to it in order to be
able to clear it later on.
There are a few caveats in rate limiting requests.
IP-based rate limiting is usually based on the remote address of the HTTP request. Unfortunately the remote address is not actually the client's IP address in many cases. For example, any CDN, load balancer, or proxy will mess with the remote address of the request. So you have to be extra careful when applying rate limits on the remote address.
Well behaving proxies usually set or adjust the X-Forwarded-For
header when they forward the request so it's possible in many cases to
pull out the client's real IP address from that header. But if the
number of forwarding proxies is not known or it varies, it's difficult
to implement IP-based rate limiting in such a way that a malicious
client cannot circumvent it by setting the X-Forwarded-For header
themselves. For example, if the application can be accessed both
directly or via a caching proxy, a malicious client could just set the
X-Forwarded-For header to some random IP address and access the
application directly.
This is probably obvious, but we'll mention it just the same: the
LocalStorage storage implementation is super simple and fast to use
in an application for counter storage, but obviously it is only
visible within that instance of the application. If you're running
multiple instances of your application behind a load balancer, each
application will have its own counters. That might be acceptable in
some cases, but usually you'll want to use a database for counter
storage so that counters are shared across all instances of the
application.
Another side effect of caching responses is that any remaining rate limit headers might not be valid when the response is served from a cache. Therefore ring-congestion doesn't set any HTTP headers reporting the total or remaining quota. Fortunately you can do it yourself with a simple ring middleware!
Copyright © 2014 Listora
Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.
Can you improve this documentation?Edit on GitHub
cljdoc builds & hosts documentation for Clojure/Script libraries
| Ctrl+k | Jump to recent docs |
| ← | Move to previous article |
| → | Move to next article |
| Ctrl+/ | Jump to the search field |