Skip to content
This repository has been archived by the owner on Nov 9, 2017. It is now read-only.

Probes and Instrumentation

ztellman edited this page May 4, 2012 · 12 revisions

Computation requires a thread of execution. In synchronous code, this thread is explicit. In asynchronous code this thread is implicit, but no less real. Event-driven programming is the process of applying structure to events until they become ordered.

But what about the inverse operation, where structured execution of code is transformed into a collection of loosely-ordered events? We do this all the time: it’s called logging.

Logging is a lossy operation; log statements lose any context that isn’t explicitly captured. While we can sometimes recreate the context by considering all the log statements collectively, this is akin to archaeology; we’re trying to create a detailed picture of what happened using sparse, unconnected details and educated guesswork.

At the same time, logging is a necessary tool. It lets us understand our code without having to fundamentally change its structure, because the stream of events it generates exists outside our thread of execution.

Lamina gives us powerful tools for transforming, aggregating, and analyzing streams of events. Therefore, if we can transform these log streams into something Lamina can consume, we have a powerful tool for understanding the inner workings of our software.

probe-channels

A probe-channel is a permanent, grounded channel. This means that it cannot be closed or put into an error state, and that messages will never accumulate in its queue. A probe-channel is also named, which means that when we interact with probe-channel abc:def, that will always be the same channel.

Probe-channels represent a stream of events, typically from an instrumented piece of code. The easiest way to enqueue a message into a probe-channel is by using trace:

(trace :foo:bar
  (calculate-pi-to-1e6-digits))

This will send a million digits of pi into the probe-channel returned by (probe-channel :foo:bar). By convention, probe namespaces are colon-delimited. If the argument given to probe-channel is a sequence, they will be joined with colon separators. All of these function calls are equivalent:

(probe-channel :foo:bar)
(probe-channel "foo:bar")
(probe-channel [:foo :bar])
(probe-channel ["foo" :bar])

However, the body of the trace statement will only be executed if something is consuming messages from the probe-channel. Unless we do something with the output of (probe-channel :foo:bar), the above trace statement is effectively a no-op.

We can also consume collections of probe-channels using select-probes. This takes either a wildcard string or a regex, and returns a channel representing the combined output of all probe-channels whose names match that pattern.

> (use 'lamina.core 'lamina.trace)
nil
> (receive-all (select-probes "foo*") #(println "traced:" %))
true
> (trace :foo:bar 1)
traced: 1
true
> (trace :foo 2)
traced: 2
true
> (trace :bar 3)
false

Notice that trace returns true if the message was consumed, and false otherwise.

instrumenting functions

While littering the code with trace statements won’t affect its semantics, it is a bit of an eyesore. Since most of the time meaningful events correlate with function calls, there’s an easy way to automatically instrument functions.

Calling lamina.trace/instrument on a function will create probes for every time the function is called, returns, and throws an exception. These are called enter, return, and error probes respectively.

> (use 'clojure.pprint)
nil
> (defn f [x y] (+ x y))
#'f
> (def f* (instrument f {:name :instrumented}))
#'f*
> (receive-all (probe-channel :instrumented:enter) #(do (println "ENTERED") (pprint %)))
true
> (receive-all (probe-channel :instrumented:return) #(do (println "RETURNED") (pprint %)))
true
> (f* 1 2)
ENTERED
{:timestamp 1336127081901, :args [1 2], :description "instrumented"}
RETURNED
{:timestamp 1336127081901,
 :sub-tasks nil,
 :duration 27000,
 :args [1 2],
 :offset 0,
 :result 3,
 :description "instrumented"}
3

Notice that the we have defined our function with the name :instrumented, so the return probe can be accessed at :instrumented:return.

The fields emitted by the enter probe are:

description the function’s name
timestamp the timestamp, in milliseconds, of the event
args the arguments passed to the function

The additional fields emitted by the return probe are:

result the value returned by the function
duration the elapsed time, in nanoseconds, since the function was called
sub-tasks the return data for any instrumented functions called in the scope of the parent function
offset the elapsed time, in nanoseconds, from the topmost parent function’s invocation to this function’s invocation

Notice that we capture the return data for all other instrumented functions automatically. To explore this, let’s use defn-instrumented, a combination of defn and instrument.

> (defn-instrumented g [x y] (+ x y))
#'g
> (defn-instrumented f
	{:probes {:return (sink->> pprint)}}
	[x y]
	(+ (g x x) (g y y)))

In this, we have two instrumented functions, f and g. We do not need to explicitly define a :name in the function metadata, since that defaults to :the:function:namespace:the-function-name.

We have also defined a :probes map for f, which allows us to consume probes without explicitly calling probe-channel. The keys of the map are the probe suffix (in this case, just :return), and the values are a channel that will consume messages from that probe-channel. For this, we use the sink->> function.

> (f 1 2) 
{:description "user:f",
 :timestamp 1336128984479,
 :offset 0,
 :duration 87000,
 :sub-tasks
 ({:description "user:g",
   :timestamp 1336128984479,
   :offset 20000,
   :duration 26000,
   :sub-tasks nil,
   :args [1 1],
   :result 2}
  {:description "user:g",
   :timestamp 1336128984479,
   :offset 58000,
   :duration 17000,
   :sub-tasks nil,
   :args [2 2],
   :result 4}),
 :args [1 2],
 :result 6}
6

Notice that the :sub-tasks field now has two entries, one for each time g was called. Each of these has their own :args and :result, and an :offset with respect to when f was initially invoked.

Clone this wiki locally