-
Notifications
You must be signed in to change notification settings - Fork 57
Probes and Instrumentation
Computation requires a thread of execution. In synchronous code, this thread is explicit. In asynchronous code this thread is implicit, but no less real. Event-driven programming is the process of applying structure to events until they become ordered.
But what about the inverse operation, where structured execution of code is transformed into a collection of loosely-ordered events? We do this all the time: it’s called logging.
Logging is a lossy operation; log statements lose any context that isn’t explicitly captured. While we can sometimes recreate the context by considering all the log statements collectively, this is akin to archaeology; we’re trying to create a detailed picture of what happened using sparse, unconnected details and educated guesswork.
At the same time, logging is a necessary tool. It lets us understand our code without having to fundamentally change its structure, because the stream of events it generates exists outside our thread of execution.
Lamina gives us powerful tools for transforming, aggregating, and analyzing streams of events. Therefore, if we can transform these log streams into something Lamina can consume, we have a powerful tool for understanding the inner workings of our software.
A probe-channel is a permanent, grounded channel. This means that it cannot be closed or put into an error state, and that messages will never accumulate in its queue. A probe-channel is also named, which means that when we interact with probe-channel abc:def
, that will always be the same channel.
Probe-channels represent a stream of events, typically from an instrumented piece of code. The easiest way to enqueue a message into a probe-channel is by using trace
:
(trace :foo:bar
(calculate-pi-to-1e6-digits))
This will send a million digits of pi into the probe-channel returned by (probe-channel :foo:bar)
. By convention, probe namespaces are colon-delimited. If the argument given to probe-channel
is a sequence, they will be joined with colon separators. All of these function calls are equivalent:
(probe-channel :foo:bar)
(probe-channel "foo:bar")
(probe-channel [:foo :bar])
(probe-channel ["foo" :bar])
However, the body of the trace
statement will only be executed if something is consuming messages from the probe-channel. Unless we do something with the output of (probe-channel :foo:bar)
, the above trace
statement is effectively a no-op.
We can also consume collections of probe-channels using select-probes
. This takes either a wildcard string or a regex, and returns a channel representing the combined output of all probe-channels whose names match that pattern.
> (use 'lamina.core 'lamina.trace)
nil
> (receive-all (select-probes "foo*") #(println "traced:" %))
true
> (trace :foo:bar 1)
traced: 1
true
> (trace :foo 2)
traced: 2
true
> (trace :bar 3)
false
Notice that trace
returns true
if the message was consumed, and false
otherwise.
While littering the code with trace
statements won’t affect its semantics, it is a bit of an eyesore. Since most of the time meaningful events correlate with function calls, there’s an easy way to automatically instrument functions.
Calling lamina.trace/instrument
on a function will create probes for every time the function is called, returns, and throws an exception. These are called enter
, return
, and error
probes respectively.
> (use 'clojure.pprint)
nil
> (defn f [x y] (+ x y))
#'f
> (def f* (instrument f {:name :instrumented}))
#'f*
> (receive-all (probe-channel :instrumented:enter) #(do (println "ENTERED") (pprint %)))
true
> (receive-all (probe-channel :instrumented:return) #(do (println "RETURNED") (pprint %)))
true
> (f* 1 2)
ENTERED
{:timestamp 1336127081901, :args [1 2], :description "instrumented"}
RETURNED
{:timestamp 1336127081901,
:sub-tasks nil,
:duration 27000,
:args [1 2],
:offset 0,
:result 3,
:description "instrumented"}
3
Notice that the we have defined our function with the name :instrumented
, so the return
probe can be accessed at :instrumented:return
.
The fields emitted by the enter
probe are:
description | the function’s name |
timestamp | the timestamp, in milliseconds, of the event |
args | the arguments passed to the function |
The additional fields emitted by the return
probe are:
result | the value returned by the function |
duration | the elapsed time, in nanoseconds, since the function was called |
sub-tasks | the return data for any instrumented functions called in the scope of the parent function |
offset | the elapsed time, in nanoseconds, from the topmost parent function’s invocation to this function’s invocation |
Notice that we capture the return
data for all other instrumented functions automatically. To explore this, let’s use defn-instrumented
, a combination of defn
and instrument
.
> (defn-instrumented g [x y] (+ x y))
#'g
> (defn-instrumented f
{:probes {:return (sink->> pprint)}}
[x y]
(+ (g x x) (g y y)))
In this, we have two instrumented functions, f
and g
. We do not need to explicitly define a :name
in the function metadata, since that defaults to :the:function:namespace:the-function-name
.
We have also defined a :probes
map for f
, which allows us to consume probes without explicitly calling probe-channel
. The keys of the map are the probe suffix (in this case, just :return
), and the values are a channel that will consume messages from that probe-channel. For this, we use the sink->>
function.
> (f 1 2)
{:description "user:f",
:timestamp 1336128984479,
:offset 0,
:duration 87000,
:sub-tasks
({:description "user:g",
:timestamp 1336128984479,
:offset 20000,
:duration 26000,
:sub-tasks nil,
:args [1 1],
:result 2}
{:description "user:g",
:timestamp 1336128984479,
:offset 58000,
:duration 17000,
:sub-tasks nil,
:args [2 2],
:result 4}),
:args [1 2],
:result 6}
6
Notice that the :sub-tasks
field now has two entries, one for each time g
was called. Each of these has their own :args
and :result
, and an :offset
with respect to when f
was initially invoked.