Agent Configuration

The agent is configured primarily by a YAML document conventionally located at /etc/signalfx/agent.yaml. The location of the config file can be specified by the -config flag to the agent binary (signalfx-agent).

Configuring your endpoint

By default, the Smart Agent will send data to the us0 realm. If you are not in this realm, you will need to explicitly set the signalFxRealm option in your config like this:
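
A minimal sketch of such a config follows. The realm name us1 and the token value are placeholders chosen for illustration only; use the realm shown on your own profile page.

```yaml
# Illustrative values only: replace both with your organization's own.
signalFxAccessToken: "YOUR_ACCESS_TOKEN"  # placeholder, not a real token
signalFxRealm: "us1"                      # assumed example realm
```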


To determine if you are in a different realm and need to explicitly set the endpoints, check your profile page in the SignalFx web application.

If you want to explicitly set the ingest, API server, and trace endpoint URLs, you can set them individually like so:

ingestUrl: ""
apiUrl: ""
traceEndpointUrl: ""

They will default to the endpoints for the realm configured in signalFxRealm if not set.
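
As a hedged illustration of that fallback, the sketch below overrides only the ingest and trace endpoints (for example, to route data through a gateway) and leaves apiUrl unset so that it is derived from the realm. The host name gateway.example.internal, the port, and the /v1/trace path are hypothetical and should be checked against your own deployment.

```yaml
signalFxRealm: "us1"   # assumed example realm; still used for apiUrl below
# Hypothetical gateway address; not a real endpoint.
ingestUrl: "http://gateway.example.internal:8080"
# traceEndpointUrl must be a full URL including the path (path shown is an assumption).
traceEndpointUrl: "http://gateway.example.internal:8080/v1/trace"
# apiUrl is intentionally omitted, so it defaults to the endpoint for signalFxRealm.
```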

Config Schema

Config option Required Type Description
signalFxAccessToken no string The access token for the org that should receive the metrics emitted by the agent.
ingestUrl no string The URL of SignalFx ingest server. Should be overridden if using the SignalFx Gateway. If not set, this will be determined by the signalFxRealm option below. If you want to send trace spans to a different location, set the traceEndpointUrl option.
traceEndpointUrl no string The full URL (including path) to the trace ingest server. If this is not set, all trace spans will be sent to the same place as ingestUrl above.
apiUrl no string The SignalFx API base URL. If not set, this will be determined by the signalFxRealm option below.
signalFxRealm no string The SignalFx Realm that the organization you want to send to is a part of. This defaults to the original realm (us0) but if you are setting up the agent for the first time, you quite likely need to change this. (default: "us0")
hostname no string The hostname that will be reported as the host dimension. If blank, this will be auto-determined by the agent based on a reverse lookup of the machine's IP address.
useFullyQualifiedHost no bool If true (the default), and the hostname option is not set, the hostname will be determined by doing a reverse DNS query on the IP address that is returned by querying for the bare hostname. This is useful in cases where the hostname reported by the kernel is a short name. (default: true)
disableHostDimensions no bool Our standard agent model is to collect metrics for services running on the same host as the agent. Therefore, host-specific dimensions (e.g. host, AWSUniqueId, etc) are automatically added to every datapoint that is emitted from the agent by default. Set this to true if you are using the agent primarily to monitor things on other hosts. You can set this option at the monitor level as well. (default: false)
intervalSeconds no integer How often to send metrics to SignalFx. Monitors can override this individually. (default: 10)
globalDimensions no map of strings Dimensions (key:value pairs) that will be added to every datapoint emitted by the agent. To specify that all metrics should be high-resolution, add the dimension sf_hires: 1
sendMachineID no bool Whether to send the machine-id dimension on all host-specific datapoints generated by the agent. This dimension is derived from the Linux machine-id value. (default: false)
cluster no string The logical environment/cluster that this agent instance is running in. All of the services that this instance monitors should be in the same environment as well. This value, if provided, will be synced as a property onto the host dimension, or onto any cloud-provided specific dimensions (AWSUniqueId, gcp_id, and azure_resource_id) when available. Example values: "prod-usa", "dev"
syncClusterOnHostDimension no bool If true, force syncing of the cluster property on the host dimension, even when cloud-specific dimensions are present. (default: false)
validateDiscoveryRules no bool If true, a warning will be emitted if a discovery rule contains variables that will never possibly match a rule. If using multiple observers, it is convenient to set this to false to suppress spurious errors. (default: true)
observers no list of objects (see below) A list of observers to use (see observer config)
monitors no list of objects (see below) A list of monitors to use (see monitor config)
writer no object (see below) Configuration of the datapoint/event writer
logging no object (see below) Log configuration
collectd no object (see below) Configuration of the managed collectd subprocess
enableBuiltInFiltering no bool If true, the agent will filter out custom metrics without having to rely on the whitelist.json filter that was previously configured under metricsToExclude. Whether a metric is custom or not is documented in each monitor's documentation. If true, every monitor's default configuration (i.e. the minimum amount of configuration to make it work) will only send non-custom metrics. In order to send out custom metrics from a monitor, certain config flags on the monitor must be set or you can specify the metric in the extraMetrics config option on each monitor if you know the specific metric name. You would not have to modify the whitelist via metricsToInclude as before. If you set this option to true, the whitelist.json entry under metricsToExclude should be removed, if it is present -- otherwise custom metrics won't be emitted. (default: false)
metricsToInclude no list of objects (see below) A list of metric filters that will whitelist/include metrics. These filters take priority over the filters specified in metricsToExclude.
metricsToExclude no list of objects (see below) A list of metric filters
propertiesToExclude no list of objects (see below) A list of properties filters
internalStatusHost no string The host on which the internal status server will listen. The internal status HTTP server serves internal metrics and diagnostic information about the agent and can be scraped by the internal-metrics monitor. Set this to an address that other hosts can reach (for example 0.0.0.0, which listens on all interfaces) if you want to monitor the agent from another host. If you set this to blank/null, the internal status server will not be started. See internalStatusPort. (default: "localhost")
internalStatusPort no integer The port on which the internal status server will listen. See internalStatusHost. (default: 8095)
profiling no bool Enables Go pprof endpoint on port 6060 that serves profiling data for development (default: false)
profilingHost no string The host/ip address for the pprof profile server to listen on. profiling must be enabled for this to have any effect. (default: "")
profilingPort no integer The port for the pprof profile server to listen on. profiling must be enabled for this to have any effect. (default: 6060)
bundleDir no string Path to the directory holding the agent dependencies. This will normally be derived automatically. Overrides the envvar SIGNALFX_BUNDLE_DIR if set.
scratch no any This exists purely to give the user a place to put common yaml values to reference in other parts of the config file.
configSources no object (see below) Configuration of remote config stores
procPath no string Path to the host's /proc filesystem. This is useful for containerized environments. (default: "/proc")
etcPath no string Path to the host's /etc directory. This is useful for containerized environments. (default: "/etc")
varPath no string Path to the host's /var directory. This is useful for containerized environments. (default: "/var")
runPath no string Path to the host's /run directory. This is useful for containerized environments. (default: "/run")
sysPath no string Path to the host's /sys directory. This is useful for containerized environments. (default: "/sys")
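
To make the top-level options above more concrete, here is a small sketch of a config that sets a few of them and uses the scratch section together with a standard YAML anchor to share a value. The token, realm, host name, dimension, and the cpu monitor type are all illustrative assumptions rather than values taken from this document.

```yaml
signalFxAccessToken: "YOUR_ACCESS_TOKEN"  # placeholder
signalFxRealm: "us1"                      # assumed example realm
hostname: "web-01.example.com"            # hypothetical; omit to auto-detect
intervalSeconds: 10                       # global default, monitors may override
cluster: "prod-usa"                       # example value from the table above
globalDimensions:
  team: "payments"                        # hypothetical dimension added to every datapoint
scratch:
  slowInterval: &slowInterval 60          # YAML anchor defined once here
monitors:
  - type: cpu                             # assumed monitor type
    intervalSeconds: *slowInterval        # YAML alias reuses the scratch value
```

The scratch key is ignored by the agent itself, so it is a convenient home for anchors like this one.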


The nested observers config object has the following fields:

The following are generic options that apply to all observers. Each observer type has its own set of additional configuration options, detailed in Observer Config.

Config option Required Type Description
type no string The type of the observer


The nested monitors config object has the following fields:

The following are generic options that apply to all monitors. Each monitor type has its own set of additional configuration options, detailed in Monitor Config.

Config option Required Type Description
type no string The type of the monitor
discoveryRule no string The rule used to match up this configuration with a discovered endpoint. If blank, the configuration will be run immediately when the agent is started. If multiple endpoints match this rule, multiple instances of the monitor type will be created with the same configuration (except different host/port).
validateDiscoveryRule no bool If true, a warning will be emitted if a discovery rule contains variables that will never possibly match a rule. If using multiple observers, it is convenient to set this to false to suppress spurious errors. The top-level setting validateDiscoveryRules acts as a default if this isn't set. (default: false)
extraDimensions no map of strings A set of extra dimensions (key:value pairs) to include on datapoints emitted by the monitor(s) created from this configuration. To specify metrics from this monitor should be high-resolution, add the dimension sf_hires: 1
extraDimensionsFromEndpoint no map of strings A mapping of extra dimension names to a discovery rule expression that is used to derive the value of the dimension. For example, to use a certain container label as a dimension, you could use something like this in your monitor config block: extraDimensionsFromEndpoint: {env: 'Get(container_labels, "")'}
configEndpointMappings no map of strings A set of mappings from a configuration option on this monitor to attributes of a discovered endpoint. The keys are the config option on this monitor and the value can be any valid expression used in discovery rules.
intervalSeconds no integer The interval (in seconds) at which to emit datapoints from the monitor(s) created by this configuration. If not set (or set to 0), the global agent intervalSeconds config option will be used instead. (default: 0)
solo no bool If one or more configurations have this set to true, only those configurations will be considered. This setting can be useful for testing. (default: false)
metricsToExclude no list of objects (see below) DEPRECATED in favor of the datapointsToExclude option. That option handles negation of filter items differently.
datapointsToExclude no list of objects (see below) A list of datapoint filters. These filters allow you to comprehensively define which datapoints to exclude by metric name or dimension set, as well as the ability to define overrides to re-include metrics excluded by previous patterns within the same filter item. See monitor filtering for examples and more information.
disableHostDimensions no bool Some monitors pull metrics from services not running on the same host and should not get the host-specific dimensions set on them (e.g. host, AWSUniqueId, etc). Setting this to true causes those dimensions to be omitted. You can disable this globally with the disableHostDimensions option on the top level of the config. (default: false)
disableEndpointDimensions no bool This can be set to true if you don't want to include the dimensions that are specific to the endpoint that was discovered by an observer. This is useful when you have an endpoint whose identity is not particularly important since it acts largely as a proxy or adapter for other metrics. (default: false)
dimensionTransformations no map of strings A map from dimension names emitted by the monitor to the desired dimension name that will be emitted in the datapoint that goes to SignalFx. This can be useful if you have custom metrics from your applications and want to make the dimensions from a monitor match those. It can also be useful when scraping free-form metrics, for example with the prometheus-exporter monitor. Right now, only static key/value transformations are supported. Note that filtering by dimensions is done on the original dimension name, not the new name. You can also remove an unwanted dimension by mapping it to an empty string.
extraMetrics no list of strings Extra metrics to enable besides the default included ones. This is an overridable filter.
extraGroups no list of strings Extra metric groups to enable in addition to the metrics that are emitted by default. A metric group is simply a collection of metrics, and they are defined in each monitor's documentation.
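
The generic monitor options above might be combined as in the following sketch. The docker observer, the collectd/redis monitor type, the discovery rule, and the dimension names are assumptions used for illustration; consult the Observer Config and Monitor Config pages for the types available to you.

```yaml
observers:
  - type: docker                 # assumed observer type
monitors:
  - type: collectd/redis         # assumed monitor type
    # One monitor instance is created per discovered endpoint matching this rule.
    discoveryRule: 'container_image =~ "redis" && port == 6379'
    intervalSeconds: 30          # overrides the global intervalSeconds
    extraDimensions:
      env: "staging"             # hypothetical dimension
    dimensionTransformations:
      plugin_instance: ""        # hypothetical name; empty string removes the dimension
    disableEndpointDimensions: false
```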


The nested metricsToExclude config object has the following fields:

For more information on filtering see Datapoint Filtering.

Config option Required Type Description
dimensions no map of any A map of dimension key/values to match against. All key/values must match a datapoint for it to be matched. The map values can be either a single string or a list of strings.
metricNames no list of strings A list of metric names to match against
metricName no string A single metric name to match against
monitorType no string (Only applicable for the top level filters) Limits the scope of this filter to datapoints from a specific monitor. If specified, any datapoints not from this monitor type will never match against this filter.
negated no bool (Only applicable for the top level filters) Negates the result of the match so that it matches all datapoints that do NOT match the metric name and dimension values given. This does not negate monitorType, if given. (default: false)


The nested datapointsToExclude config object has the following fields:

Config option Required Type Description
dimensions no map of any A map of dimension key/values to match against. All key/values must match a datapoint for it to be matched. The map values can be either a single string or a list of strings.
metricNames no list of strings A list of metric names to match against
metricName no string A single metric name to match against
monitorType no string (Only applicable for the top level filters) Limits the scope of this filter to datapoints from a specific monitor. If specified, any datapoints not from this monitor type will never match against this filter.
negated no bool (Only applicable for the top level filters) Negates the result of the match so that it matches all datapoints that do NOT match the metric name and dimension values given. This does not negate monitorType, if given. (default: false)
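
A monitor-level datapointsToExclude filter could look roughly like the sketch below. The monitor type, host, port, metric names, and dimension values are hypothetical, and the glob pattern and the leading "!" used to re-include a metric are assumptions about the filter syntax; see the monitor filtering documentation for the authoritative form.

```yaml
monitors:
  - type: collectd/redis              # assumed monitor type
    host: "127.0.0.1"                 # hypothetical
    port: 6379                        # hypothetical
    datapointsToExclude:
      - metricNames:
          - "gauge.connected_*"         # hypothetical pattern to exclude
          - "!gauge.connected_clients"  # assumed override syntax to re-include one metric
        dimensions:
          env: ["dev", "test"]          # a map value may be a string or a list of strings
```

Because all key/values in dimensions must match, this filter item would only drop datapoints that carry an env dimension of dev or test.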


The nested writer config object has the following fields:

Config option Required Type Description
datapointMaxBatchSize no integer The maximum number of datapoints to include in a batch before sending the batch to the ingest server. Smaller batch sizes than this will be sent if datapoints originate in smaller chunks. Larger batch sizes may also be used if the maxRequests limit is hit -- the next request will consist of all of the datapoints queued in the meantime. (default: 1000)
maxDatapointsBuffered no integer The maximum number of datapoints that are allowed to be buffered in the agent (i.e. received from a monitor but have not yet received confirmation of successful receipt by the target ingest/gateway server downstream). Any datapoints that come in beyond this number will overwrite existing datapoints if they have not been sent yet, starting with the oldest. (default: 5000)
traceSpanMaxBatchSize no integer The analogue of datapointMaxBatchSize for trace spans. (default: 1000)
datapointMaxRequests no integer Deprecated: use maxRequests instead. (default: 0)
maxRequests no integer The maximum number of concurrent requests to make to a single ingest server with datapoints/events/trace spans. This number multiplied by datapointMaxBatchSize is more or less the maximum number of datapoints that can be "in-flight" at any given time. Same thing for the traceSpanMaxBatchSize option and trace spans. (default: 10)
eventSendIntervalSeconds no integer The agent does not send events immediately upon a monitor generating them, but buffers them and sends them in batches. The lower this number, the less delay for events to appear in SignalFx. (default: 1)
propertiesMaxRequests no unsigned integer The analogue of maxRequests for dimension property requests. (default: 20)
propertiesMaxBuffered no unsigned integer How many dimension property updates to hold pending being sent before dropping subsequent property updates. Property updates will be resent eventually and they are slow to change so dropping them (esp on agent start up) usually isn't a big deal. (default: 10000)
propertiesSendDelaySeconds no unsigned integer How long to wait for property updates to be sent once they are generated. Any duplicate updates to the same dimension within this time frame will result in the latest property set being sent. This helps prevent spurious updates that get immediately overwritten by very flappy property generation. (default: 30)
propertiesHistorySize no unsigned integer Properties that are synced to SignalFx are cached to prevent duplicate requests from being sent, causing unnecessary load on our backend. (default: 10000)
logDatapoints no bool If the log level is set to debug and this is true, all datapoints generated by the agent will be logged. (default: false)
logEvents no bool The analogue of logDatapoints for events. (default: false)
logTraceSpans no bool The analogue of logDatapoints for trace spans. (default: false)
logDimensionUpdates no bool If true, dimension updates will be logged at the INFO level. (default: false)
logDroppedDatapoints no bool If true, and the log level is debug, filtered out datapoints will be logged. (default: false)
sendTraceHostCorrelationMetrics no bool Whether to send host correlation metrics to correlate traced services with the underlying host. (default: true)
staleServiceTimeout no int64 How long to keep sending the correlation datapoints for a service after its name was last seen in a trace span. This should be a duration string that is accepted by Go's time.ParseDuration. This option is irrelevant if sendTraceHostCorrelationMetrics is false. (default: "5m")
traceHostCorrelationMetricsInterval no int64 How frequently to send host correlation metrics that are generated from the service name seen in trace spans sent through or by the agent. This should be a duration string that is accepted by Go's time.ParseDuration. This option is irrelevant if sendTraceHostCorrelationMetrics is false. (default: "1m")
maxTraceSpansInFlight no unsigned integer How many trace spans are allowed to be in the process of sending. While this number is exceeded, existing pending spans will be randomly dropped if possible to accommodate new spans generated to avoid memory exhaustion. If you see log messages about "Aborting pending trace requests..." or "Dropping new trace spans..." it means that the downstream target for traces is not able to accept them fast enough. Usually if the downstream is offline you will get connection refused errors and most likely spans will not build up in the agent (there is no retry mechanism). In the case of slow downstreams, you might be able to increase maxRequests to increase the concurrent stream of spans downstream (if the target can make efficient use of additional connections) or, less likely, increase traceSpanMaxBatchSize if your batches are maxing out (turn on debug logging to see the batch sizes being sent) and being split up too much. If neither of those options helps, your downstream is likely too slow to handle the volume of trace spans and should be upgraded to more powerful hardware/networking. (default: 100000)
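
As a worked illustration of how the writer limits interact, the sketch below doubles maxRequests for a slow downstream. The numbers are arbitrary examples, not recommendations.

```yaml
writer:
  datapointMaxBatchSize: 1000
  # 20 concurrent requests x 1000 datapoints per batch gives roughly
  # 20000 datapoints in flight at most (the default 10 x 1000 gives about 10000).
  maxRequests: 20
  maxDatapointsBuffered: 25000   # example value; the oldest unsent datapoints are overwritten beyond this
  traceSpanMaxBatchSize: 1000
  maxTraceSpansInFlight: 100000  # default shown explicitly
```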


The nested logging config object has the following fields:

Config option Required Type Description
level no string Valid levels include debug, info, warn, error. Note that debug logging may leak sensitive configuration (e.g. passwords) to the agent output. (default: "info")
format no string The log output format to use. Valid values are: text, json. (default: "text")


The nested collectd config object has the following fields:

Config option Required Type Description
disableCollectd no bool If you won't be using any collectd monitors, this can be set to true to prevent collectd from pre-initializing (default: false)
timeout no integer How many read intervals before abandoning a metric. Doesn't affect much in normal usage. See Timeout. (default: 40)
readThreads no integer Number of threads dedicated to executing read callbacks. See ReadThreads (default: 5)
writeThreads no integer Number of threads dedicated to writing value lists to write callbacks. This should be much less than readThreads because writing is batched in the write_http plugin that writes back to the agent. See WriteThreads. (default: 2)
writeQueueLimitHigh no integer The maximum numbers of values in the queue to be written back to the agent from collectd. Since the values are written to a local socket that the agent exposes, there should be almost no queuing and the default should be more than sufficient. See WriteQueueLimitHigh (default: 500000)
writeQueueLimitLow no integer The lowest number of values in the collectd queue before which metrics begin being randomly dropped. See WriteQueueLimitLow (default: 400000)
logLevel no string Collectd's log level -- info, notice, warning, or err (default: "notice")
intervalSeconds no integer A default read interval for collectd plugins. If zero or undefined, will default to the global agent interval. Some collectd python monitors do not support overriding the interval at the monitor level, but this setting will apply to them. (default: 0)
writeServerIPAddr no string The local IP address of the server that the agent exposes to which collectd will send metrics. This defaults to an arbitrary address in the localhost subnet, but can be overridden if needed. (default: "")
writeServerPort no integer The port of the agent's collectd metric sink server. If set to zero (the default) it will allow the OS to assign it a free port. (default: 0)
configDir no string This is where the agent will write the collectd config files that it manages. If you have secrets in those files, consider setting this to a path on a tmpfs mount. The files in this directory should be considered transient -- there is no value in editing them by hand. If you want to add your own collectd config, see the collectd/custom monitor. (default: "/var/run/signalfx-agent/collectd")
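
A collectd section that moves the managed config files to a different directory might be sketched like this. The configDir path is hypothetical and is only assumed to sit on a tmpfs mount on the target host.

```yaml
collectd:
  disableCollectd: false
  readThreads: 5
  writeThreads: 2              # kept well below readThreads, as advised above
  logLevel: "warning"
  # Hypothetical path, assumed to be on tmpfs so secrets are not written to disk.
  configDir: "/run/signalfx-agent/collectd"
```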


The nested metricsToInclude config object has the following fields:

Config option Required Type Description
dimensions no map of any A map of dimension key/values to match against. All key/values must match a datapoint for it to be matched. The map values can be either a single string or a list of strings.
metricNames no list of strings A list of metric names to match against
metricName no string A single metric name to match against
monitorType no string (Only applicable for the top level filters) Limits the scope of this filter to datapoints from a specific monitor. If specified, any datapoints not from this monitor type will never match against this filter.
negated no bool (Only applicable for the top level filters) Negates the result of the match so that it matches all datapoints that do NOT match the metric name and dimension values given. This does not negate monitorType, if given. (default: false)


The nested metricsToExclude config object has the following fields:

For more information on filtering see Datapoint Filtering.

Config option Required Type Description
dimensions no map of any A map of dimension key/values to match against. All key/values must match a datapoint for it to be matched. The map values can be either a single string or a list of strings.
metricNames no list of strings A list of metric names to match against
metricName no string A single metric name to match against
monitorType no string (Only applicable for the top level filters) Limits the scope of this filter to datapoints from a specific monitor. If specified, any datapoints not from this monitor type will never match against this filter.
negated no bool (Only applicable for the top level filters) Negates the result of the match so that it matches all datapoints that do NOT match the metric name and dimension values given. This does not negate monitorType, if given. (default: false)
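
The top-level filters could be used together along the lines of the sketch below. The monitor type, metric names, and dimension values are hypothetical, and the glob pattern is an assumption about the supported matching syntax.

```yaml
metricsToExclude:
  # Applies only to datapoints from one monitor type.
  - monitorType: collectd/redis          # assumed monitor type
    metricNames:
      - "gauge.*"                        # hypothetical pattern
  # Applies to datapoints from any monitor with a matching dimension value.
  - dimensions:
      container_name: ["sidecar", "pause"]  # hypothetical values
metricsToInclude:
  # Include filters take priority over the exclude filters above.
  - metricName: "gauge.connected_clients"   # hypothetical metric
```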


The nested propertiesToExclude config object has the following fields:

Config option Required Type Description
propertyName no string A single property name to match (default: "*")
propertyValue no string A property value to match (default: "*")
dimensionName no string A dimension name to match (default: "*")
dimensionValue no string A dimension value to match (default: "*")
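
A property filter might be written as in this short sketch; the property and dimension names are hypothetical. Fields left out keep their "*" default and therefore match anything.

```yaml
propertiesToExclude:
  - propertyName: "pod-template-hash"    # hypothetical property name
    dimensionName: "kubernetes_pod_uid"  # hypothetical dimension name
    # propertyValue and dimensionValue default to "*"
```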


The nested configSources config object has the following fields:

For more information about how to use config sources, see Remote Config.

Config option Required Type Description
watch no bool Whether to watch config sources for changes. If this is true and any of the config changes (either the main agent.yaml, or remote config values), the agent will dynamically reconfigure itself with minimal disruption. This is generally better than restarting the agent on config changes since that can result in larger gaps in metric data. The main disadvantage of watching is slightly greater network and compute resource usage. This option is not itself watched for changes. If you change the value of this option, you must restart the agent. (default: true)
file no object (see below) Configuration for other file sources
zookeeper no object (see below) Configuration for a Zookeeper remote config source
etcd2 no object (see below) Configuration for an Etcd 2 remote config source
consul no object (see below) Configuration for a Consul remote config source
vault no object (see below) Configuration for a Hashicorp Vault remote config source


The nested file config object has the following fields:

Config option Required Type Description
pollRateSeconds no integer How often to poll files (in seconds) to test for changes. There are so many edge cases that break inotify that it is more robust to simply poll files than rely on that. This option is not subject to watching and changes to it will require an agent restart. (default: 5)


The nested zookeeper config object has the following fields:

Config option Required Type Description
endpoints no list of strings A list of Zookeeper servers to use for the client
timeoutSeconds no unsigned integer Client timeout (default: 10)
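
Putting the watch, file, and zookeeper options together, a configSources section could be sketched as follows. The Zookeeper address is a hypothetical placeholder.

```yaml
configSources:
  watch: true              # changing this value itself requires an agent restart
  file:
    pollRateSeconds: 10    # files are polled rather than watched with inotify
  zookeeper:
    endpoints:
      - "zk1.example.internal:2181"  # hypothetical server
    timeoutSeconds: 10
```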


The nested etcd2 config object has the following fields:

Config option Required Type Description
endpoints no list of strings A list of Etcd2 servers to use
username no string An optional username to use when connecting
password no string An optional password to use when connecting


The nested consul config object has the following fields:

Config option Required Type Description
endpoint no string A Consul server URL
username no string An optional username to use when connecting
password no string An optional password to use when connecting
token no string An authentication token, if needed
datacenter no string The Consul datacenter to use


The nested vault config object has the following fields:

Config option Required Type Description
vaultAddr no string The Vault Address. Can also be provided by the standard Vault envvar VAULT_ADDR. This option takes priority over the envvar if provided.
vaultToken no string The Vault token. Can also be provided by the standard Vault envvar VAULT_TOKEN. This option takes priority over the envvar if provided.
kvV2PollInterval no int64 The polling interval for checking KV V2 secrets for a new version. This can be any string value that can be parsed by Go's time.ParseDuration. (default: "60s")
authMethod no string The authentication method to use, if any, to obtain the Vault token. If vaultToken is specified above, this option will have no effect. Currently supported values are: iam, gcp.
iam no object (see below) Further config options for the iam auth method. These options are identical to the CLI helper tool options
gcp no object (see below) Further config options for the gcp auth method. These options are identical to the CLI helper tool options. You must provide a valid GCP IAM credential JSON either explicitly via the credentials option (not recommended), or through any GCP Application Default Credentials.


The nested iam config object has the following fields:

Config option Required Type Description
awsAccessKeyId no string Explicit AWS access key ID
awsSecretAccessKey no string Explicit AWS secret access key
awsSecurityToken no string Explicit AWS security token for temporary credentials
headerValue no string Value for the x-vault-aws-iam-server-id header in requests
mount no string Path where the AWS credential method is mounted. This is usually provided via the -path flag in the "vault login" command, but it can be specified here as well. If specified here, it takes precedence over the value for -path. The default value is "aws".
role no string Name of the Vault role to request a token against
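
For example, a Vault config source that authenticates with the iam method might be sketched like this. The Vault address and role name are hypothetical; explicit AWS keys are omitted here, on the assumption that credentials come from the environment instead.

```yaml
configSources:
  vault:
    vaultAddr: "https://vault.example.internal:8200"  # hypothetical address
    authMethod: iam            # ignored if vaultToken is set
    kvV2PollInterval: "60s"
    iam:
      role: "signalfx-agent"   # hypothetical Vault role name
      mount: "aws"             # the documented default mount path
```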


The nested gcp config object has the following fields:

Config option Required Type Description
role no string Required. The name of the role you're requesting a token for.
mount no string This is usually provided via the -path flag in the "vault login" command, but it can be specified here as well. If specified here, it takes precedence over the value for -path. Defaults to gcp.
credentials no string Explicitly specified GCP credentials in JSON string format (not recommended)
jwt_exp no integer Time until the generated JWT expires in minutes. The given IAM role will have a max_jwt_exp field, the time in minutes that all valid authentication JWTs must expire within (from time of authentication). Defaults to 15 minutes, the default max_jwt_exp for a role. Must be less than an hour.
service_account no string Service account to generate a JWT for. Defaults to credentials "client_email" if "credentials" specified and this value is not. The actual credential must have the "iam.serviceAccounts.signJWT" permissions on this service account.
project no string Project for the service account who will be authenticating to Vault. Defaults to the credential's "project_id" (if credentials are specified).

Example YAML

Here is an autogenerated example of a YAML config file, with default values where applicable:

  signalFxRealm: "us0"
  disableHostDimensions: false
  intervalSeconds: 10
  sendMachineID: false
  syncClusterOnHostDimension: false
  validateDiscoveryRules: true
  observers: []
  monitors: []
  # Datapoint/event writer settings
  writer:
    datapointMaxBatchSize: 1000
    maxDatapointsBuffered: 5000
    traceSpanMaxBatchSize: 1000
    datapointMaxRequests: 0
    maxRequests: 10
    eventSendIntervalSeconds: 1
    propertiesMaxRequests: 20
    propertiesMaxBuffered: 10000
    propertiesSendDelaySeconds: 30
    propertiesHistorySize: 10000
    logDatapoints: false
    logEvents: false
    logTraceSpans: false
    logDimensionUpdates: false
    logDroppedDatapoints: false
    sendTraceHostCorrelationMetrics: true
    staleServiceTimeout: "5m"
    traceHostCorrelationMetricsInterval: "1m"
    maxTraceSpansInFlight: 100000
  # Log configuration
  logging:
    level: "info"
    format: "text"
  # Managed collectd subprocess
  collectd:
    disableCollectd: false
    timeout: 40
    readThreads: 5
    writeThreads: 2
    writeQueueLimitHigh: 500000
    writeQueueLimitLow: 400000
    logLevel: "notice"
    intervalSeconds: 0
    writeServerIPAddr: ""
    writeServerPort: 0
    configDir: "/var/run/signalfx-agent/collectd"
  enableBuiltInFiltering: false
  metricsToInclude: []
  metricsToExclude: []
  propertiesToExclude: []
  internalStatusHost: "localhost"
  internalStatusPort: 8095
  profiling: false
  profilingHost: ""
  profilingPort: 6060
  # Remote config stores
  configSources:
    watch: true
    file:
      pollRateSeconds: 5
    zookeeper:
      endpoints: []
      timeoutSeconds: 10
    etcd2:
      endpoints: []
    vault:
      kvV2PollInterval: "60s"
  procPath: "/proc"
  etcPath: "/etc"
  varPath: "/var"
  runPath: "/run"
  sysPath: "/sys"