Configuration
Cernan has many different options to configure its runtime behaviour. While a few command line flags are available the preferred method to configure cernan is via a toml file. This page discusses the options available for configuration in that format.
The cernan server has options to control which ports and protocols run ingestion interfaces. These are referred to as 'sources'. By default cernan will listen on UDP:8125 and TCP:2003 for statsd and graphite traffic, respectively. The sources that can be enabled with their defaults are:
[sources]
[sources.statsd.primary]
[sources.graphite.primary]
[sources.native.primary]
The full documentation for sources is here.
The cernan server provides mechanisms to transform in-flight telemetry and logs. This mechanism is the 'filter'.
The full documentation is here.
The cernan server has many ways to ship data to external systems. These are called 'sinks'. By default no sinks are enabled, and in this mode cernan has nowhere to ship the data it receives. Sinks are configured individually, by name.
[sinks]
[sinks.console]
[sinks.wavefront]
[sinks.null]
[sinks.firehose.stream_one]
For sinks which support it cernan can report metadata about the metric. These are called "tags" by some aggregators and cernan uses this terminology. In AWS you might choose to include your instance ID with each metric point, as well as the service name. You may set tags like so:
[tags]
source = "cernan"
Please see Sinks for the list of supported sinks and for sink-specific configuration details.
A forward is a routing destination from one source or filter to potentially many
sinks or filters. Each source must set at least one forward. Forwards are
configured through the forwards parameter on each source. Consider the
following:
[sources]
[sources.statsd.primary]
enabled = true
port = 8125
forwards = ["sinks.console"]
[sources.statsd.secondary]
enabled = true
port = 8126
forwards = ["sinks.null"]
[sources.graphite.primary]
enabled = true
port = 2004
forwards = ["sinks.null", "sinks.console"]
[sinks]
[sinks.console]
bin_width = 1
[sinks.null]
bin_width = 1
This configures cernan with two statsd sources, named 'primary' and 'secondary', listening on ports 8125 and 8126 respectively. Additionally, a single graphite source is enabled on port 2004. The primary statsd source will forward all of its metrics to the console sink while the secondary statsd source will forward to the null sink. The graphite source will forward its metrics to both the null and console sinks.
By default cernan's sinks will flush every sixty seconds. You may adjust this
behaviour by modifying the flush-interval directive:
flush-interval=<INT> How frequently to flush metrics to the sinks in seconds. [default: 60].
The flush-interval does not affect aggregations. Cernan's aggregation model is
discussed in full on this wiki's data model page. Sinks will accept independent
flush interval configuration, but this must be specified with flush_interval.
Note the underscore.
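As a minimal sketch of the two spellings side by side, the fragment below sets a global interval and overrides it for one sink. It assumes flush-interval sits at the top level of the TOML file and that the console sink accepts flush_interval; the values themselves are arbitrary.

```toml
# Global flush interval, in seconds (hyphenated, top level).
flush-interval = 30

[sinks]
  [sinks.console]
  # Per-sink override, in seconds (underscored).
  flush_interval = 5
```

Top-level keys have to appear before the first table header, otherwise TOML treats them as members of that table.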
For sinks which support it cernan can report metadata about the metric. These are called "tags" by some aggregators and cernan uses this terminology. You may configure tags per-sink – see sink documentation – or you may specify global tags to be applied to all sinks. You may set global tags like so:
[tags]
source = "cernan"
hostname = { environment = true, value = "HOSTNAME" }
The first will set the tag source to have the value cernan; the second will
set hostname to the value of the environment variable HOSTNAME. Each key /
value pair will be converted to the appropriate tag, depending on the sink.
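A sketch of tagging every point with an instance ID and a service name, as in the AWS scenario this page mentions. The environment variable name INSTANCE_ID and the service value are hypothetical; only the two tag forms (literal string and environment lookup) come from the example above.

```toml
[tags]
# Literal value (hypothetical service name).
service = "checkout"
# Read from the environment; INSTANCE_ID is an assumed variable
# that your deployment would need to export.
instance_id = { environment = true, value = "INSTANCE_ID" }
```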
Cernan separates its source / filter / sink threads by using a disk-backed mpsc (multi-producer, single-consumer) queue variant called Hopper. In overload conditions Hopper will buffer data to disk, keeping cernan's memory use low. Hopper works by writing index files to disk. This option controls the maximum size of these files.
max-hopper-queue-bytes = <INT> Soft-maximum size in bytes of hopper index files [default: 104857600]
The default maximum size of the index files is 100MB (104857600 bytes). On disk constrained systems with complex event routing you may wish to set this value lower.
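For a disk constrained host, a lower limit might look like the following sketch. The 10MB figure is only an illustration, and the fragment assumes the option is set at the top level of the TOML file.

```toml
# Soft maximum for each hopper index file: 10 * 1024 * 1024 bytes.
max-hopper-queue-bytes = 10485760
```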
By default cernan will put its on-disk queues into TMPDIR. While this is
acceptable for testing and development this is not desirable for production
deployments. You may adjust where cernan stores its on-disk queues with the
data-directory option:
data-directory = "/var/lib/cernan/"
In the above, we are requiring that cernan store its files in
/var/lib/cernan. The structure of this data is not defined. Cernan will not
create the path data-directory points to if it does not exist.
By default cernan will read programmable filters from /tmp/cernan-scripts.
While this is acceptable for testing and development this is not desirable for
production deployments. You may adjust where cernan searches for on-disk
scripts with the scripts-directory option:
scripts-directory = "/etc/cernan/scripts"
Cernan will not create the path scripts-directory points to if it does not
exist.
In some cases it's not necessary for cernan to ship the aggregates of each point for every second it receives them to achieve a statistically accurate impression of your system. To that end, cernan allows the user to control the width of aggregation bins on a per-sink basis. For instance, the following will aggregate points into 1 second bins for the console sink and 10 second bins for the wavefront sink:
[sinks.console]
bin_width = 1
[sinks.wavefront]
bin_width = 10
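Pulling the options on this page together, a small configuration file might look like the sketch below. It uses only option names shown above and assumes the hyphenated directives belong at the top level of the file; the paths and values are the ones from the earlier examples, not recommendations.

```toml
# Top-level directives come first, before any table header.
data-directory = "/var/lib/cernan/"
scripts-directory = "/etc/cernan/scripts"
flush-interval = 60
max-hopper-queue-bytes = 104857600

[tags]
source = "cernan"

[sources]
  [sources.statsd.primary]
  enabled = true
  port = 8125
  # Every source needs at least one forward.
  forwards = ["sinks.console"]

[sinks]
  [sinks.console]
  bin_width = 1
```

Both directories must already exist, since cernan will not create them.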
A Postmates Project