Releases: DataDog/datadog-agent
6.8.0
Changes
Prelude
Please note that a critical bug has been identified in this release that would prevent the Kubernetes integration from collecting kubelet metrics.
The severity of the issue has led us to remove the packages for the affected platform (Docker) and to make the latest tag point to the 6.7.0 release.
If you have upgraded to this version of the containerized agent, we recommend you downgrade to 6.7.0.
Release on: 2018-12-13
- Please refer to the 6.8.0 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.8.0 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.8.0 tag on process-agent for the list of changes on the Process Agent.
The Datadog Agent now automatically looks for the container short image name to set the default value for the log source and service.
The source is especially important as it triggers the automatic configuration of your platform with integration pipelines and facets.
The Datadog Agent autodiscovery can still be used to override the default source and service with pod annotations or container labels.
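As a rough illustration of the default described above, the sketch below derives a short image name from an image reference and uses it for both source and service unless overrides are supplied. The function names and the exact stripping rules are assumptions made for this example; they are not taken from the Agent's code.

```python
from typing import Dict, Optional


def short_image_name(image: str) -> str:
    """Return the last path component of an image reference, without tag or digest."""
    # Drop a digest suffix such as "@sha256:...".
    name = image.split("@", 1)[0]
    # Keep only the last path component (drops registry host, port and organization).
    name = name.rsplit("/", 1)[-1]
    # Drop the tag, if any.
    return name.split(":", 1)[0]


def default_log_tags(image: str, overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Use the short image name for source and service unless overridden.

    `overrides` stands in for values coming from pod annotations or
    container labels (hypothetical shape, for illustration only).
    """
    short = short_image_name(image)
    tags = {"source": short, "service": short}
    tags.update(overrides or {})
    return tags
```

For instance, `default_log_tags("registry.example.com:5000/org/nginx:1.15")` yields `nginx` for both keys, while passing `{"source": "web"}` as overrides replaces only the source.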
New Features
- Enable docker config provider if docker.sock exists
- The new command datadog-agent config prints the runtime config of the agent.
- Adds eBPF-based network collection component called network-tracer.
- Add diagnosis to the agent for connectivity to the cluster agent
- datadog-agent integration install command prevents a user from downgrading an integration to a version older than the one shipped by default in the agent.
- Adding kerberos support with libkrb5.
- datadog-agent integration install command moves configuration files present in the data directory of the wheel upon successful installation
Enhancement Notes
- Adding a default location on Windows for the file storing pointers to make sure we never lose nor duplicate any logs
- Add an option to the agent check command to run the check n times
- Set service and source to the docker short image name when container_collect_all flag is enabled and no label or annotation is defined
- Docker: the datadog/dogstatsd image now ships a healthcheck
- Improved consistency of the ECS and Fargate tagging
- Improve logging when python checks use invalid types for tags
- Added a region tag to Fargate containers, indicating the AWS region they run in
- Adds system.cpu.interrupt, and system.mem.committed, system.mem.paged, system.mem.nonpaged, system.mem.cached metrics on Windows
- Add permissions.log file to the flare archive.
- Add an agent go-routine dump to the flare as reported by the built-in pprof runtime profiling interface.
- The agent can now expose its healthcheck on a dedicated http port. The Kubernetes daemonset uses this by default, on port 5555.
- It's possible now to have different poll intervals for each autodiscovery configuration provider
- Improve Windows Event parsing. Event.EventData.Data fields are parsed as one JSON object. Event.EventData.Binary field is parsed to its string value
- Rename the Windows Event "#text" field to "value". This fixes the facet creation of those fields
- Add a status.log and a config-check.log with a basic message in the flare if the agent is not running or is unreachable.
- Added support for wildcards to DD_KUBERNETES_POD_LABELS_AS_TAGS. For example, DD_KUBERNETES_POD_LABELS_AS_TAGS='{"*":"kube_%%label%%"}' will add all pod labels as tags to your metrics with tag names prefixed by kube_.
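To make the wildcard mapping concrete, here is a minimal sketch of how a template such as kube_%%label%% could be expanded into tag names. It is an illustration only: the function name is hypothetical, and the rule that an exact label entry wins over the "*" entry is an assumption of this example, not something stated in these notes.

```python
import json
from typing import Dict, List


def labels_to_tags(pod_labels: Dict[str, str], mapping_json: str) -> List[str]:
    """Expand a labels-as-tags mapping (JSON string) against a pod's labels."""
    mapping = json.loads(mapping_json)
    tags = []
    for label, value in sorted(pod_labels.items()):
        # Assumption: an exact label entry takes priority over the wildcard.
        template = mapping.get(label, mapping.get("*"))
        if template is None:
            continue  # label not selected by the mapping
        tag_name = template.replace("%%label%%", label)
        tags.append(f"{tag_name}:{value}")
    return tags


if __name__ == "__main__":
    labels = {"app": "web", "tier": "front"}
    print(labels_to_tags(labels, '{"*":"kube_%%label%%"}'))
```

With the wildcard mapping from the example above, every pod label becomes a tag whose name carries the kube_ prefix.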
Upgrade Notes
- The agent now requires a cluster agent version 1.0+ to establish
a valid connection
Deprecation Notes
- Removed support for logs_config.tcp_forward_port as it's no longer needed for other integrations.
Bug Fixes
- Configure error log when failing to run docker inspect to read as debug instead, as this log is duplicated by the tagger.
- Fix a bug where datadog-agent integration users could not test the --in-toto flag due to a filesystem permission issue.
- The cluster agent client init now fails as expected if the cluster agent URL is not valid
- Print correct error when the datadog-agent integration command fails after installing an integration
- Fix build failure on 32bit armv7
- Fix a bug with Docker logging driver where logs would not be tailed after a log rotation when the option --log-opt max-file=1 was set.
- Display the correct timezone name in the status page.
- On Windows, the agent now properly computes the location of ProgramData for configuration files instead of using hardcoded values
Other Notes
- JMXFetch upgraded to 0.23.0. See https://github.com/DataDog/jmxfetch/releases/tag/0.23.0
- On Linux, use the cgo dns resolver instead of the golang one. This will make the agent use glibc to resolve hostnames and should give more predictable results.
6.7.0
Changes
This release only ships changes to the trace-agent.
This release focuses on simplifying Trace Search configuration. APM Events can now be configured at the tracer level. Tracers will get updated in the near future to expose this option.
- Please refer to the 6.7.0 tag on trace-agent for the list of changes on the Trace Agent.
6.6.0
Changes
Prelude
Release on: 2018-10-25
- Please refer to the 6.6.0 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.6.0 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.6.0 tag on process-agent for the list of changes on the Process Agent.
New Features
- Disk check support for the puppy agent on unix-like systems
- Support for the upcoming cluster-agent cluster-level checks feature, via the clusterchecks config provider
- Add a new CRI core check that will send metrics about resource usage of your containers via the Container Runtime Interface.
- Support SysVinit on Debian. Note: some warnings can appear if you enable/disable the agent manually on a systemd system. They can be safely ignored
- The datadog-agent integration install command will now check for compatibility with datadog-checks-base shipped with the agent. In case of mismatch, it will try to rollback to the previously installed integration version and exit with a failure.
- Add --in-toto flag to datadog-agent integration command to enable in-toto
- Add --verbose flag to datadog-agent integration command to enable verbose logging on pip and TUF
- Docker image: running with a read-only root filesystem is now supported
Enhancement Notes
- Add a setting to configure the interval at which configs should be polled for autodiscovery.
- Support a new config option, site, that allows setting the Datadog site to which the Agent should send data. dd_url is still supported and, when set, overrides site.
- Display a warning in the agent status when too many logs are being tailed and the agent is not tailing them all. This happens with wildcards in the path of the tailed files
- Dogstatsd supports removing the hostname on events and service checks as it did with metrics, by adding an empty host: tag
- Added new dogstatsd_tags variable which can be used to specify additional tags to append to all metrics received by dogstatsd.
- dogstatsd cleans up stale UNIX socket on startup.
- The ecs-agent's docker container name can now be set via the ecs_agent_container_name option or the DD_ECS_AGENT_CONTAINER_NAME envvar for autodetection.
- EKS pause containers are ignored by default
- All python and go checks support the new empty_default_hostname option to send metrics with no hostname. This is used for cluster-level checks
- All go checks now support the min_collection_interval option, as python checks already do
- Added a kubelet_wait_on_missing_container option to handle hosts where the kubelet's podlist is slow to update, leading to missing tags or failing Autodiscovery. Set it to 1 for a 1 second maximum wait
- Add an option to enable protobuf communication with the Kubernetes apiserver
- datadog-agent integration command will not pull any of the integration's dependencies
- More accurate tag extraction logic for Docker Swarm
- Added new command line properties to the Windows installer which allow for setting site specific configuration.
Bug Fixes
- Fix an issue preventing the exit logs of the agent from displaying the correct filename.
- Fix a bug that occurred when check labels/annotations were misconfigured and that prevented the logs of the container from being tailed
- Fix an issue causing the agent to stop when systemd-journald service is stopped or fails
- Fix deadlock when a config item under logs is invalid
- Fix system.mem.pct_usable implementation on Linux 3.14+ to match Datadog Agent 5
- Fix a potential race in the autodiscovery where a service would be removed before its config could be resolved (causing the agent to crash)
- Fixes crash on Windows when the agent encounters a malformed performance counter database
- Fixes config.Digest, which was not stable: it depended on the order of tags in the instance. It also did not take LogsConfig into account; this is fixed as well.
- Fix an issue where the log agent would prevent files from being log rotated on Windows
- Correctly pass the agent's proxy settings to pip when using the datadog-agent integration command with TUF enabled.
- Recover from errors when connection to the docker socket is lost to continue tailing containers.
- When installing / updating wheels using the datadog-agent integration command, we replace the PyPI index with our own by default, in order to prevent accidental installation of Datadog or even third-party packages from PyPI.
- Remove some undocumented power user options to the datadog-agent integration command to prevent accidental misconfiguration that may reduce security guarantees.
Other Notes
- JMXFetch upgraded to 0.21.0; Adds support for rmi registry connection over SSL and client authentication.
- Use autodiscovery in log-agent kubernetes integration
6.5.2
Docker, Windows, Linux, MacOS
Changes
Prelude
Release on: 2018-09-20
- Please refer to the 6.5.2 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.5.2 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.5.2 tag on process-agent for the list of changes on the Process Agent.
Bug Fixes
- Fix a crash in the logs package that could occur when a docker tailer initialization failed.
6.5.1
Docker, Windows, Linux
Changes
Prelude
- Please refer to the 6.5.1 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.5.1 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.5.1 tag on process-agent for the list of changes on the Process Agent.
Bug Fixes
- Fix possible deadlocks that could occur when new docker sources and services are pushed and:
  - The docker socket is closed at agent setup
  - The docker socket is not mounted
  - The kubernetes integration is enabled
- Fix a deadlock that could occur when the logs-agent is enabled and the configuration parameter logs_config.container_collect_all or the environment variable DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL is set to true.
6.5.0
Please note that a critical bug identified in this release, affecting container log collection when the container_collect_all option was set, would lead to an agent deadlock. The severity of the issue has led us to remove the packages for the affected platforms (Linux and Docker). If you have upgraded to this version on Linux or Docker, we recommend you downgrade to 6.4.2.
Prelude
- Please refer to the 6.5.0 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.5.0 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.5.0 tag on process-agent for the list of changes on the Process Agent.
New Features
- Autodiscovery: the docker and kubelet listeners will retry on error, to support starting the agent before your container runtime (host install)
- Bump the default number of check runners to 4. This has some concurrency implications as we will now run multiple checks in parallel.
- Kubernetes: to avoid hostname collisions between clusters, a new cluster_name option is available. It will be added as a suffix to the host alias detected from the kubelet in order to make these aliases unique across different clusters.
- Docker image: handle docker/kubernetes secret files with a helper script.
- The Node Agent can rely on the Datadog Cluster Agent to collect Node Labels.
- Improved ECS fargate tagging:
  - Honor the docker_labels_as_tags option to extract custom tags
  - Make the cluster_name tag shorter
  - Add the short_image and container_id tags
  - Remove some noisy tags
  - Fix a lifecycle issue that caused missing tags
- The live containers view can now retrieve containers directly from the kubelet, in order to support containerd and crio
- Kubernetes events: setting event host tags to the related hosts, instead of the host collecting the events.
- Added dedicated configuration parameters to send logs to a proxy by TCP. Note that logs_config.dd_url, logs_config.dd_port and logs_config.dev_mode_no_ssl are deprecated and will be unavailable soon; use the new parameters logs_config.logs_dd_url and logs_config.logs_no_ssl instead.
- Added the possibility to send logs to Datadog using the port 443.
Enhancement Notes
- Add more environment variables to the flare whitelist
- When dd_url is set to app.datadoghq.eu, the infra Agent also sends data to versioned endpoints (similar to app.datadoghq.com)
- Make all numbers on the status page more human readable (using unit and SI prefix when appropriate)
- Display hostname provider and errors on the status page
- Kubelet Autodiscovery: reduce logging when no change is detected
- On Windows, the hostname_fqdn flag will now be honored, and the host reported by Datadog will be the fully qualified hostname.
- Enable all configuration options to be set with env vars
- Tags generated from GCE metadata may now be omitted by using the collect_gce_tags configuration option.
- Introduction of a new bucketed scheduler to enable multiple check workers to increase concurrency while spreading the load over the collection interval.
- The 'status' command and 'status' page (in the GUI) now display errors raised by the '__init__' method of a Python check.
- Exclude the rancher pause container in the agent
- On status page, allow users to know which instance of a check matches which yaml instance in configcheck page
- The file_handle check reports 4 new metrics for feature parity with agent 5
- The ntp check will now query multiple servers by default to be more resilient to servers returning wrong offsets. A new config option, hosts, is now available in the ntp check configuration file to allow users to change the list of ntp servers.
- Tags and sources in the tagger-list command are now sorted to ease troubleshooting.
- To allow concurrent execution of subprocess calls from python, we now save the thread state and release the GIL to unblock the interpreter. We can reacquire the GIL and restore the thread state when the subprocess call returns.
- Add a new configuration option, named tag_value_split_separator, allowing the specified list of raw tags to have its value split by a given separator. Only applies to host tags and tags coming from container integrations. Does not apply to tags on dogstatsd metrics or tags collected by other integrations.
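The effect of splitting a raw tag value by a separator can be pictured with the short sketch below. It assumes, for illustration only, that the configuration boils down to a mapping from tag name to separator; the function name and that mapping shape are hypothetical and not taken from the Agent.

```python
from typing import Dict, List


def split_tag_values(tags: List[str], separators: Dict[str, str]) -> List[str]:
    """Split the value of selected "name:value" tags into one tag per part."""
    out = []
    for tag in tags:
        name, colon, value = tag.partition(":")
        separator = separators.get(name)
        if not colon or not separator:
            # Tag has no value, or no separator is configured for it: keep as-is.
            out.append(tag)
            continue
        # Emit one tag per non-empty part of the value.
        out.extend(f"{name}:{part}" for part in value.split(separator) if part)
    return out


if __name__ == "__main__":
    print(split_tag_values(["team:a,b", "env:prod"], {"team": ","}))
```

Tags whose name has no configured separator pass through untouched, which mirrors the note that the option only targets the specified raw tags.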
Upgrade Notes
- Autodiscovery now enforces the ac_exclude and ac_include filtering options for all listeners. Please double-check your exclusion patterns before upgrading and add inclusion patterns if some autodiscovered containers match these.
- The introduction of multiple runners for checks implies check instances may now run concurrently. This should help the agent make better use of resources, in particular it will help prevent or reduce the side-effects of slow checks delaying the execution of all other checks.
  The change will affect custom checks not enforcing thread safety as they may, depending on the schedule, access unsynchronized structures concurrently with the corresponding data race ensuing. If you wish to run checks in a fully sequential fashion, you may set the check_runners option to 1 in your datadog.yaml config or via the DD_CHECK_RUNNERS environment variable. Also, please feel free to reach out to us if you need more information or help with the new multiple runner/concurrency model.
  For more details please read the technical note in the datadog.yaml.
- Prometheus custom checks are now limited to 2000 metrics by default to provide users control over the maximum number of custom metrics sent in the case of configuration errors or input changes. This limit can be changed with the max_returned_metrics option in the check configuration.
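For custom checks that keep shared state, the concurrency note above usually translates into guarding that state with a lock. The sketch below shows the general pattern with plain Python threading; CounterCheck and run_concurrently are hypothetical names for this example and do not use the Agent's check API.

```python
import threading
from typing import Dict


class CounterCheck:
    """Hypothetical check-like object that counts runs per instance name."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._runs: Dict[str, int] = {}

    def check(self, instance: Dict[str, str]) -> None:
        key = instance["name"]
        # The read-modify-write below must not interleave between runners.
        with self._lock:
            self._runs[key] = self._runs.get(key, 0) + 1

    def count(self, name: str) -> int:
        with self._lock:
            return self._runs.get(name, 0)


def run_concurrently(check: CounterCheck, instance: Dict[str, str],
                     workers: int = 4, runs: int = 500) -> None:
    """Simulate several runners executing the same check in parallel."""
    def worker() -> None:
        for _ in range(runs):
            check.check(instance)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The four default workers here simply echo the new default number of runners; setting the runner count to 1, as described above, removes the need for the lock at the cost of sequential execution.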
Bug Fixes
- All Autodiscovery listeners now enforce the ac_exclude and ac_include filtering options, as described in the documentation.
- Fixed "logs_config.frame_size" override that would not be taken into account.
- collect io metrics for drives with path only (like: C:C0) on Windows
- Fix API_KEY validation for 'additional_endpoints' by using their respective endpoint instead of the main one all the time.
- Fix port ordering for the %%port_%% Autodiscovery tag on the docker listener
- Fix missing ECS tags under some conditions
- Change the name of the agent expvar from aggregator/ServiceCheckFlushed) to aggregator/ServiceCheckFlushed
- Fix an issue where logs wouldn't be ingested if the API key contains a trailing new line
- Setting the log level of the check subcommand using the -l flag was not setting the log level of python integrations.
- Display embedded Python version in the status page instead of the version from the system Python.
- Fixes a bug causing kube_service tags to be missing when kubernetes_map_services_on_ip is false.
- The ntp check now handles negative offsets if the host time is in the future.
- Fix a possible index out of range panic in Dogstatsd origin detection
- Fix a verbose debug log caused by rescheduling services with no checks associated with them.
Other Notes
- JMXFetch upgraded to 0.20.2; ships updated FasterXML.
- Remove noisy and useless debug log line from contextResolver
6.4.2
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-08-13
- Please refer to the 6.4.2 tag on integrations-core for the list of changes on the Core Checks.
Enhancement Notes
- The flare command does not collect the agent container's environment
variables anymore
Bug Fixes
- Fixes an issue with docker tailing on restart of monitored containers. Previously, at each container restart the agent would resubmit all logs. Now, on restart we use tracked offsets properly and, as a result, submit only new logs
6.4.1 / 2018-08-01
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-08-01
- Please refer to the 6.4.1 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.4.1 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.4.1 tag on process-agent for the list of changes on the Process Agent.
New Features
- Create packaging for google cloud launcher integration.
- Add options to exclude specific payloads from being sent to Datadog. In some environments, some of the gathered information is considered too sensitive to be sent to Datadog (e.g. IP addresses in events or service checks). This feature adds the option to exclude specific payload types from being sent to the backend.
- Collect container disk metrics less often in the docker check, decreasing its effect on performance when enabled.
- Autodiscovery now supports the %%hostname%% tag on the docker listener. This tag will resolve to the container's hostname value if present in the container inspect. It is useful if the container IP is not available or erroneous.
- Dogstatsd origin detection now supports container tagging for Kubernetes clusters running containerd or cri-o, in addition to the existing docker support
- This release ships full support of Kubernetes 1.3+
- OpenShift ClusterResourceQuotas metrics are now collected by the kube_apiserver check, under the openshift.clusterquota.* and openshift.appliedclusterquota.* names.
- Display the version for Python checks on the status page.
Enhancement Notes
- Adding DD_EXPVAR_PORT to the configuration environment variables.
- On Windows, specifically log to both the log file and the event viewer what initiated an agent shutdown. Also logs specific startup errors to both the log file and event viewer.
- The embedded Python has been bumped from 2.7.14 to 2.7.15
- Agent expvar metrics now have default values. Metrics like the number of packets dropped by the agent or errors were previously not reported until a first event occurred. This should make it easier to use the expvar configuration agent_stats.yaml.
- Proxy settings can be configured through the environment variables DD_PROXY_HTTP, DD_PROXY_HTTPS and DD_PROXY_NO_PROXY. These environment variables take precedence over the proxy options configured in datadog.yaml, and behave exactly the same way as these options. The standard HTTP_PROXY, HTTPS_PROXY and NO_PROXY are still honored but have known side effects on integrations; for simplicity we recommend using the new environment variables. For more information, please refer to our proxy docs
- Update to distribution metrics algorithm with improved accuracy
- Added ECS pause containers to the default docker exclusion list
- Adding logging for when the agent fails to detect the origin of a packet in dogstatsd socket mode because of namespace issues.
- The skip_ssl_validation configuration option can now be set through the related DD_SKIP_SSL_VALIDATION env var
- The Agent will log failed healthchecks on query and during exit
- On Windows, provides installation parameter to set the cmd_port, the port on which the agent command interface runs. To be used if the default (5001) is already used by another program.
- The kube_service tag is now collected on Kubernetes 1.3.x versions. The matching uses a new logic. If it were to fail, reverting to the previous logic is possible by setting the kubernetes_map_services_on_ip option to true.
- The Kubernetes event collection timeout is now configurable
- Logs Agent: Added SOCKS5 proxy support. Use logs_config: socks5_proxy_address: fqdn.example.com:port to set the proxy.
- The diagnose output is now sorted by the diagnosis name
- Adding the status of the DCA (If enabled) in the Agent status command.
Upgrade Notes
- If the environment variables that can be used to configure a proxy (DD_PROXY_HTTP, DD_PROXY_HTTPS, DD_PROXY_NO_PROXY, HTTP_PROXY, HTTPS_PROXY and NO_PROXY) are present with an empty value (e.g. HTTP_PROXY=""), the Agent now uses this empty value instead of ignoring it and using lower-precedence options.
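The difference between an absent variable and one that is present but empty can be sketched as follows. This is only an illustration of the lookup rule: the function name is hypothetical, DD_PROXY_HTTP taking precedence over datadog.yaml comes from these notes, and placing HTTP_PROXY after the datadog.yaml value is an assumption of this example.

```python
from typing import Mapping, Optional


def resolve_http_proxy(env: Mapping[str, str], yaml_proxy: Optional[str] = None) -> str:
    """Pick the HTTP proxy value, treating a present-but-empty variable as set."""
    # Membership test, not truthiness: DD_PROXY_HTTP="" is used as-is
    # instead of falling through to lower-precedence options.
    if "DD_PROXY_HTTP" in env:
        return env["DD_PROXY_HTTP"]
    # Value from the proxy options in datadog.yaml, if configured.
    if yaml_proxy is not None:
        return yaml_proxy
    # Assumption of this sketch: the standard variable is consulted last.
    if "HTTP_PROXY" in env:
        return env["HTTP_PROXY"]
    return ""


if __name__ == "__main__":
    # Empty value wins over the configured proxy.
    print(repr(resolve_http_proxy({"DD_PROXY_HTTP": ""}, "http://proxy.example.com:3128")))
```

Before this change, a check such as `if env.get("DD_PROXY_HTTP"):` would have skipped the empty value, which is the behavior the upgrade note says no longer applies.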
Deprecation Notes
- Begin deprecating "Agent start" command. It is being replaced by
"run". The "start" command will continue to function, with a
deprecation notice
Security Issues
- 'app_key' value from the configuration is now redacted when
creating a flare with the agent.
Bug Fixes
- Fixes presence of invalid UTF-8 characters when docker log message is greater than 16Kb
- Fix a possible agent crash due to a race condition in the autodiscovery.
- Fixed an issue with jmxfetch not being killed on agent exit.
- Errors logged before the agent initialized the log module are now printed on STDERR instead of being silenced.
- Detect and handle Docker messages without header.
- Fixes installation, packaging scripts for OpenSUSE LEAP and greater.
- In the event of being unable to lock the dd-agent user (eg. dd-agent is an LDAP user) during installation, do not fail; print relevant warning.
- The leader election process is now restarted if the leader stops leading.
- Avoid Linux package installation failures when both the initctl and systemctl commands are present but upstart is used as the init system
Other Notes
- The system information collected from gohai no longer includes network information when the agent is running in a container since the network information is for the container and not the host itself.
- The ntp check now runs every 15 minutes by default to avoid over-loading the NTP server pools
- Added new command "run" to the agent. This command replaces the "start" command, to reduce ambiguity with the service lifecycle commands
6.3.3
Docker, Windows, Linux
Changes
Prelude
Release on: 2018-07-16
- Please refer to the 6.3.3 tag on integrations-core for the list of changes on the Core Checks.
- Please refer to the 6.3.3 tag on trace-agent for the list of changes on the Trace Agent.
- Please refer to the 6.3.3 tag on process-agent for the list of changes on the Process Agent.
Enhancements
- Add 'system.mem.buffered' metric on linux system.
Bug Fixes
- Fix the IO check behavior on unix based on 'iostat' tool:
  - Most metrics are an average time, so we don't need to divide again by 'delta' (ex: number of read/time doing read operations)
  - time is based on the millisecond and not the second
- Kubernetes API Server's polling frequency is now customisable.
- Use as expected the configuration value of kubernetes_metadata_tag_update_freq, introduce a kubernetes_apiserver_client_timeout configuration option.
- Fix a bug that led the agent to panic in some cases if the log_level configuration option was set to error.
6.3.2
Docker, Windows, Linux
Changes
Prelude
Released on: 2018-07-04
- Please refer to the 6.3.2 tag on integrations-core for the list of changes on the Core Checks.
Bug Fixes
- The service mapper now groups the mappings of pods to services by namespace. This prevents kube_service tags from being erroneously applied to metrics for a pod that is not targeted by a service but has the same name as a pod in a different namespace targeted by that service.
- Fix a bug in dogstatsd metrics parsing where the Agent would leave the host tag empty instead of applying its hostname on metrics with a tag metadata field but no tags (i.e. the tags field is only one # character). Regression introduced in 6.3.0
- Replace invalid utf-8 characters by the standard replacement char.
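The dogstatsd parsing fix above can be illustrated with a small sketch of how a hostname might be chosen for a metric datagram: a tags field holding only "#" carries no tags, so the Agent's own hostname applies. The function name is hypothetical and this is not the Agent's parser; treating an explicit empty host: tag as "no hostname" follows the behavior described for metrics in the 6.6.0 notes.

```python
from typing import List, Tuple


def parse_host(datagram: str, default_hostname: str) -> Tuple[str, List[str]]:
    """Return (hostname, remaining tags) for one dogstatsd metric datagram."""
    fields = datagram.split("|")
    tags: List[str] = []
    # Fields after "name:value|type" are optional metadata; "#" starts the tags.
    for field in fields[2:]:
        if field.startswith("#"):
            tags = [t for t in field[1:].split(",") if t]

    hostname = default_hostname  # applies when no host tag is given
    kept = []
    for tag in tags:
        if tag.startswith("host:"):
            # An explicit host tag overrides the default, even when empty.
            hostname = tag[len("host:"):]
        else:
            kept.append(tag)
    return hostname, kept


if __name__ == "__main__":
    print(parse_host("page.views:1|c|#", "web-1"))
    print(parse_host("page.views:1|c|#host:,env:prod", "web-1"))
```

The first call models the regression case: the tags field is present but empty, so the default hostname must still be applied.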