safely
has a comprehensive tracking system.
Prior version
0.7.0
,safely
relied on a library called TRACKit! which in turns it is a wrapper for the popular Dropwizard's Metrics library. So if you are using an older version please refer to the online documentation:safely-0.5.0
tracking
From version 0.7.0
, safely
internally uses a library called
μ/trace for event-based
logging and tracing.
Out of the box by just adding the :track-as
option in a safely block
you get tracing instrumentation of the safely block. The
instrumentation is done at two levels: the inner block the outer
block.
The inner block is the code expression which will be executed potentially multiple times. The outer block is executed only once per evaluation.
For example let's take a look at the following code snippet:
(ns your.ns)
(safely \
|
;; an external call \ |
(users/load-user :id "d548a66b") | Inner |
/ | Outer
:on-error |
:max-retries 3 |
:default nil |
:track-as ::load-user) /
The users/load-user
call is in a safely block, and we have added the
:track-as :your.ns/load-user
option. In this case safely
will
automatically instrument and push events with the following structure:
safely
block execution (outer)So if we assume that in one instance of this execution the inner call failed twice before succeeding the we will have in total 4 events:
:safely/attempt 0
and :safely/attempt 1
)
with :mulog/outcome :error
and the exception generated in the
:exception
value.:safely/attempt 2
) with :mulog/outcome :ok
:safely/call-level :outer
)
with the overall :mulog/outcome :ok
:mulog/timestamp
and the duration
(:mulog/duration
) expressed in nanoseconds.:exception
key will contain the actual exception.:app-name
, :version
and :env
where
set via global context.Here the events with all the details.
{:mulog/event-name :your.ns/load-user,
:mulog/timestamp 1595529674636,
:mulog/trace-id #mulog/flake "4XEQAD_Z_W2zwXpyCPOMnVtsDBXvzzt5",
:mulog/root-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/parent-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/duration 259125129,
:mulog/namespace "your.ns",
:mulog/outcome :error,
:app-name "safely-demo",
:env "local",
:exception #error {
:cause "Connection timeout!"
:data {:cause :timeout}
:via
[{:type clojure.lang.ExceptionInfo
:message "Connection timeout!"
:data {:cause :timeout}
:at [clojure.core$ex_info invokeStatic "core.clj" 4617]}]
:trace
[[..full stack trace..]
[...] ;; stack omitted for brevity
[clojure.lang.AFn run "AFn.java" 22]
[java.lang.Thread run "Thread.java" 832]]},
:version "1.2.3",
:safely/attempt 0,
:safely/call-level :inner,
:safely/call-site "your.ns[l:2264, c:5]",
:safely/call-type :direct,
:safely/max-retries 3}
{:mulog/event-name :your.ns/load-user,
:mulog/timestamp 1595529675338,
:mulog/trace-id #mulog/flake "4XEQAGBpt66OLi5Ut97GRRJvLzjdgBrI",
:mulog/root-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/parent-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/duration 406038359,
:mulog/namespace "your.ns",
:mulog/outcome :error,
:app-name "safely-demo",
:env "local",
:exception #error {
:cause "Connection timeout!"
:data {:cause :timeout}
:via
[{:type clojure.lang.ExceptionInfo
:message "Connection timeout!"
:data {:cause :timeout}
:at [clojure.core$ex_info invokeStatic "core.clj" 4617]}]
:trace
[[..full stack trace..]
[...] ;; stack omitted for brevity
[clojure.lang.AFn run "AFn.java" 22]
[java.lang.Thread run "Thread.java" 832]]},
:version "1.2.3",
:safely/attempt 1,
:safely/call-level :inner,
:safely/call-site "your.ns[l:2264, c:5]",
:safely/call-type :direct,
:safely/max-retries 3}
{:mulog/event-name :your.ns/load-user,
:mulog/timestamp 1595529676867,
:mulog/trace-id #mulog/flake "4XEQALtOcmYzqPzVyYXnG84N4ykloTky",
:mulog/root-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/parent-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/duration 340710808,
:mulog/namespace "your.ns",
:mulog/outcome :ok,
:app-name "safely-demo",
:env "local",
:version "1.2.3",
:safely/attempt 2,
:safely/call-level :inner,
:safely/call-site "your.ns[l:2264, c:5]",
:safely/call-type :direct,
:safely/max-retries 3}
{:mulog/event-name :your.ns/load-user,
:mulog/timestamp 1595529674636,
:mulog/trace-id #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/root-trace #mulog/flake "4XEQAD_ZMyVfusmOHGfUXwny4aKKuSSV",
:mulog/duration 2571433163,
:mulog/namespace "your.ns",
:mulog/outcome :ok,
:app-name "safely-demo",
:env "local",
:version "1.2.3",
:safely/call-level :outer,
:safely/call-site "your.ns[l:2264, c:5]",
:safely/circuit-breaker nil}
μ/log has the possibility to publish all these events to various systems. It is recommended to use Elasticsearch to search and quickly aggregates quantitative measure across all the events, but there are a number of supported systems where you can send the events for further processing, aggregation, alerting, please refer to μ/log documentation for a complete list of available publishers.
One of the systems that works well with safely
is
Zipkin (or any OpenZipkin compatible system) for
distributed tracking. All safely
blocks are automatically
instrumented with μ/trace which provides events in a format
which is compatible to distributed tracing, hence, by activating the
Zipkin publisher as follow:
(μ/start-publisher!
{:type :zipkin
:url "http://localhost:9411/"})
you can get the above events in the form of traces:
Each blueish block (span) corresponds to an attempt, the size of the
span correspond to the :mulog/duration
and the top level span is the
outer call. All the additional information available in the events will be pushed as tags
and therefore can be queried by the Zipkin interface.
If you are using a circuit-breaker with safely
the following
additional metrics are published.
For example let's take a look at the following code snippet:
(safely
;; an external call
(users/load-user :id "d548a66b")
:on-error
:max-retries 3
:default nil
:track-as ::load-user
:circuit-breaker :user-service)
In this case in addition to all the previous information, each event will contain also the following keys:
;; which circuit-breaker is in use with the block
:safely/circuit-breaker :user-service
;; if a timeout is set via the `:timeout` option it will
;; appear in the `:inner` calls, `nil` otherwise.
:safely/timeout nil
;; if the circuit breaker is blocking a call or is successful
:safely/circuit-breaker-outcome :success
The key :safely/circuit-breaker-outcome
will have one of the following values:
:success
- inner block successfully executed:execution-error
- refers to an error in the execution of the inner block.:timeout
- the inner block execution didn't complete in time.:queue-full
- when the processing queue can't accept more items.:circuit-open
- when the circuit breaker is open
and all
requests are rejected for a certain amount of time.Can you improve this documentation?Edit on GitHub
cljdoc is a website building & hosting documentation for Clojure/Script libraries
× close