
Merge branch 'master' into kyleneale/bump_python_312
Kyle-Neale authored Sep 4, 2024
2 parents 8252ce8 + d977500 commit ab77031
Showing 60 changed files with 5,186 additions and 90 deletions.
2 changes: 1 addition & 1 deletion .ddev/config.toml
@@ -167,7 +167,7 @@ exclude = [
'ddtrace', # https://github.com/DataDog/integrations-core/pull/9132
'foundationdb', # Breaking datadog_checks_base tests
'pyasn1', # https://github.com/pyasn1/pyasn1/issues/52
'pysmi', # pysnmp dependent on pysmi version > 0.4.8; < 0.5.0
'pysmi', # pysnmp dependent on pysmi version 1.2.1
'pysnmp', # Breaking snmp tests
'aerospike', # v8+ breaks agent build.
# https://github.com/DataDog/integrations-core/pull/16080
22 changes: 11 additions & 11 deletions agent_requirements.in
@@ -6,21 +6,21 @@ beautifulsoup4==4.12.3; python_version > '3.0'
beautifulsoup4==4.9.3; python_version < '3.0'
binary==1.0.0
boto3==1.17.112; python_version < '3.0'
boto3==1.34.153; python_version > '3.0'
boto3==1.35.10; python_version > '3.0'
boto==2.49.0
botocore==1.20.112; python_version < '3.0'
botocore==1.34.153; python_version > '3.0'
botocore==1.35.10; python_version > '3.0'
cachetools==3.1.1; python_version < '3.0'
cachetools==5.4.0; python_version > '3.0'
cachetools==5.5.0; python_version > '3.0'
clickhouse-cityhash==1.0.2.3; python_version < '3.0'
clickhouse-cityhash==1.0.2.4; python_version > '3.0'
clickhouse-driver==0.2.0; python_version < '3.0'
clickhouse-driver==0.2.8; python_version > '3.0'
clickhouse-driver==0.2.9; python_version > '3.0'
cm-client==45.0.4
confluent-kafka==2.5.0; python_version > '3.0'
contextlib2==0.6.0.post1; python_version < '3.0'
cryptography==3.3.2; python_version < '3.0'
cryptography==42.0.8; python_version > '3.0'
cryptography==43.0.0; python_version > '3.0'
ddtrace==0.32.2; sys_platform == 'win32' and python_version < '3.0'
ddtrace==0.53.2; sys_platform != 'win32' and python_version < '3.0'
ddtrace==2.10.6; python_version > '3.0'
@@ -45,10 +45,10 @@ mmh3==4.1.0; python_version > '3.0'
oauthlib==3.1.0; python_version < '3.0'
oauthlib==3.2.2; python_version > '3.0'
openstacksdk==3.3.0; python_version > '3.0'
orjson==3.10.6; python_version > '3.0'
orjson==3.10.7; python_version > '3.0'
packaging==24.1; python_version > '3.0'
paramiko==2.12.0; python_version < '3.0'
paramiko==3.4.0; python_version > '3.0'
paramiko==3.4.1; python_version > '3.0'
ply==3.11
prometheus-client==0.12.0; python_version < '3.0'
prometheus-client==0.20.0; python_version > '3.0'
@@ -82,7 +82,7 @@ pyvmomi==8.0.3.0.1; python_version > '3.0'
pywin32==228; sys_platform == 'win32' and python_version < '3.0'
pywin32==306; sys_platform == 'win32' and python_version > '3.0'
pyyaml==5.4.1; python_version < '3.0'
pyyaml==6.0.1; python_version > '3.0'
pyyaml==6.0.2; python_version > '3.0'
redis==3.5.3; python_version < '3.0'
redis==5.0.8; python_version > '3.0'
requests-kerberos==0.12.0; python_version < '3.0'
@@ -92,7 +92,7 @@ requests-ntlm==1.3.0; python_version > '3.0'
requests-oauthlib==1.3.1; python_version < '3.0'
requests-oauthlib==2.0.0; python_version > '3.0'
requests-toolbelt==1.0.0
requests-unixsocket2==0.4.1; python_version > '3.0'
requests-unixsocket2==0.4.2; python_version > '3.0'
requests-unixsocket==0.3.0; python_version < '3.0'
requests==2.27.1; python_version < '3.0'
requests==2.32.3; python_version > '3.0'
@@ -103,9 +103,9 @@ semver==2.13.0; python_version < '3.0'
semver==3.0.2; python_version > '3.0'
service-identity[idna]==21.1.0; python_version < '3.0'
service-identity[idna]==24.1.0; python_version > '3.0'
simplejson==3.19.2
simplejson==3.19.3
six==1.16.0
snowflake-connector-python==3.12.0; python_version > '3.0'
snowflake-connector-python==3.12.1; python_version > '3.0'
supervisor==4.2.5
tuf==4.0.0; python_version > '3.0'
typing==3.10.0.0; python_version < '3.0'
1 change: 1 addition & 0 deletions amazon_msk/changelog.d/18478.added
@@ -0,0 +1 @@
Update dependencies
2 changes: 1 addition & 1 deletion amazon_msk/pyproject.toml
@@ -40,7 +40,7 @@ license = "BSD-3-Clause"
[project.optional-dependencies]
deps = [
"boto3==1.17.112; python_version < '3.0'",
"boto3==1.34.153; python_version > '3.0'",
"boto3==1.35.10; python_version > '3.0'",
]

[project.urls]
6 changes: 6 additions & 0 deletions cisco_aci/CHANGELOG.md
@@ -2,6 +2,12 @@

<!-- towncrier release notes start -->

## 2.10.2 / 2024-09-02

***Fixed***:

* [NDM] [Cisco ACI] Use actual int for interface index ([#18414](https://github.com/DataDog/integrations-core/pull/18414))

## 2.10.1 / 2024-08-20

***Fixed***:
1 change: 0 additions & 1 deletion cisco_aci/changelog.d/18414.fixed

This file was deleted.

1 change: 1 addition & 0 deletions cisco_aci/changelog.d/18478.added
@@ -0,0 +1 @@
Update dependencies
2 changes: 1 addition & 1 deletion cisco_aci/datadog_checks/cisco_aci/__about__.py
@@ -2,4 +2,4 @@
# All rights reserved
# Licensed under a 3-clause BSD style license (see LICENSE)

__version__ = "2.10.1"
__version__ = "2.10.2"
2 changes: 1 addition & 1 deletion cisco_aci/pyproject.toml
@@ -40,7 +40,7 @@ license = "BSD-3-Clause"
[project.optional-dependencies]
deps = [
"cryptography==3.3.2; python_version < '3.0'",
"cryptography==42.0.8; python_version > '3.0'",
"cryptography==43.0.0; python_version > '3.0'",
]

[project.urls]
1 change: 1 addition & 0 deletions clickhouse/changelog.d/18478.added
@@ -0,0 +1 @@
Update dependencies
2 changes: 1 addition & 1 deletion clickhouse/pyproject.toml
@@ -42,7 +42,7 @@ deps = [
"clickhouse-cityhash==1.0.2.3; python_version < '3.0'",
"clickhouse-cityhash==1.0.2.4; python_version > '3.0'",
"clickhouse-driver==0.2.0; python_version < '3.0'",
"clickhouse-driver==0.2.8; python_version > '3.0'",
"clickhouse-driver==0.2.9; python_version > '3.0'",
"lz4==2.2.1; python_version < '3.0'",
"lz4==4.3.3; python_version > '3.0'",
]
1 change: 1 addition & 0 deletions datadog_checks_base/changelog.d/18478.added
@@ -0,0 +1 @@
Update dependencies
15 changes: 7 additions & 8 deletions datadog_checks_base/pyproject.toml
@@ -40,13 +40,12 @@ db = [
deps = [
"binary==1.0.0",
"cachetools==3.1.1; python_version < '3.0'",
"cachetools==5.4.0; python_version > '3.0'",
"cachetools==5.5.0; python_version > '3.0'",
"contextlib2==0.6.0.post1; python_version < '3.0'",
"cryptography==3.3.2; python_version < '3.0'",
"cryptography==42.0.8; python_version > '3.0'",
"cryptography==43.0.0; python_version > '3.0'",
"ddtrace==0.32.2; sys_platform == 'win32' and python_version < '3.0'",
"ddtrace==0.53.2; sys_platform != 'win32' and python_version < '3.0'",
# https://github.com/DataDog/dd-trace-py/issues/10002
"ddtrace==2.10.6; python_version > '3.0'",
"enum34==1.1.10; python_version < '3.0'",
"importlib-metadata==2.1.3; python_version < '3.8'",
@@ -61,13 +60,13 @@ deps = [
"pywin32==228; sys_platform == 'win32' and python_version < '3.0'",
"pywin32==306; sys_platform == 'win32' and python_version > '3.0'",
"pyyaml==5.4.1; python_version < '3.0'",
"pyyaml==6.0.1; python_version > '3.0'",
"pyyaml==6.0.2; python_version > '3.0'",
"requests-toolbelt==1.0.0",
"requests-unixsocket2==0.4.1; python_version > '3.0'",
"requests-unixsocket2==0.4.2; python_version > '3.0'",
"requests-unixsocket==0.3.0; python_version < '3.0'",
"requests==2.27.1; python_version < '3.0'",
"requests==2.32.3; python_version > '3.0'",
"simplejson==3.19.2",
"simplejson==3.19.3",
"six==1.16.0",
"typing==3.10.0.0; python_version < '3.0'",
"uptime==3.0.1",
@@ -77,7 +76,7 @@ deps = [
http = [
"aws-requests-auth==0.4.3",
"botocore==1.20.112; python_version < '3.0'",
"botocore==1.34.153; python_version > '3.0'",
"botocore==1.35.10; python_version > '3.0'",
"oauthlib==3.1.0; python_version < '3.0'",
"oauthlib==3.2.2; python_version > '3.0'",
"pyjwt==1.7.1; python_version < '3.0'",
@@ -93,7 +92,7 @@ http = [
"win-inet-pton==1.1.0; sys_platform == 'win32' and python_version < '3.0'",
]
json = [
"orjson==3.10.6; python_version > '3.0'",
"orjson==3.10.7; python_version > '3.0'",
]
kube = [
"kubernetes==18.20.0; python_version < '3.0'",
1 change: 1 addition & 0 deletions docs/developer/base/api.md
@@ -21,6 +21,7 @@
- send_log
- get_log_cursor
- warning
- http

## Stubs

46 changes: 46 additions & 0 deletions docs/developer/base/logs-crawlers.md
@@ -0,0 +1,46 @@
# Log Crawlers

## Overview

Some systems expose their logs from HTTP endpoints instead of files that the Logs Agent can tail.
In such cases, you can create an Agent integration to crawl the endpoints and submit the logs.

The following diagram illustrates how crawling logs integrates into the Datadog Agent.

<div align="center" markdown="1">

```mermaid
graph LR
subgraph "Agent Integration (you write this)"
A[Log Stream] -->|Log Records| B(Log Crawler Check)
end
subgraph Agent
B -->|Save Logs| C[(Log File)]
D(Logs Agent) -->|Tail Logs| C
end
D -->|Submit Logs| E(Logs Intake)
```

</div>

## Interface

::: datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck
options:
heading_level: 3
members:
- get_log_streams
- process_streams
- check

::: datadog_checks.base.checks.logs.crawler.stream.LogStream
options:
heading_level: 3
members:
- records
- __init__

::: datadog_checks.base.checks.logs.crawler.stream.LogRecord
options:
heading_level: 3
members: []
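
Putting the interface together, a minimal crawler might look like the sketch below. The class and method names come from the interface above; `ExampleLogCrawlerCheck`, `ExampleLogStream`, the stream name, and the log payload are illustrative placeholders only.

```python
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck
from datadog_checks.base.checks.logs.crawler.stream import LogRecord, LogStream


class ExampleLogStream(LogStream):
    def records(self, cursor=None):
        # Yield each log record collected since the last saved cursor.
        yield LogRecord(
            data={'message': 'example log line'},
            cursor={'timestamp': 1725000000.0},
        )


class ExampleLogCrawlerCheck(LogCrawlerCheck):
    __NAMESPACE__ = 'example'

    def get_log_streams(self):
        # Return an iterable of LogStream instances to crawl on every run.
        return [ExampleLogStream(check=self, name='example stream')]
```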
149 changes: 149 additions & 0 deletions docs/developer/tutorials/logs/http-crawler.md
@@ -0,0 +1,149 @@
# Submit Logs from HTTP API

## Getting Started

This tutorial assumes you have done the following:

- [Set up your environment](../../index.md#getting-started).
- Read the [logs crawler documentation](../../base/logs-crawlers.md).
- Read about the [HTTP capabilities](../../base/http.md) of the base class.

Let's say we are building an integration for an API provided by *ACME Inc.*
Run the following command to create the scaffolding for our integration:

```
ddev create ACME
```

This adds a folder called `acme` in our `integrations-core` folder.
We will spend the rest of the tutorial in the `acme` folder:
```
cd acme
```

To spin up the integration from our scaffolding, first add the following to `tests/conftest.py`:

```python
@pytest.fixture(scope='session')
def dd_environment():
yield {'tags': ['tutorial:acme']}
```

Then run:
```
ddev env start acme py3.11 --dev
```

## Define an Agent Check

We start by registering an implementation for our integration.
At first it is empty; we will expand on it step by step.

Open `datadog_checks/acme/check.py` in our editor and put the following there:

```python
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck


class AcmeCheck(LogCrawlerCheck):
__NAMESPACE__ = 'acme'
```

Now we'll run something we will refer to as *the check command*:
```
ddev env agent acme py3.11 check
```

We'll see the following error:
```
Can't instantiate abstract class AcmeCheck with abstract method get_log_streams
```

We need to define the `get_log_streams` method.
As [stated in the docs](../../base/logs-crawlers.md#datadog_checks.base.checks.logs.crawler.base.LogCrawlerCheck.get_log_streams), it must return an iterator over `LogStream` subclasses.
The next section describes this further.

## Define a Stream of Logs

In the same file, add a `LogStream` subclass and return it (wrapped in a list) from `AcmeCheck.get_log_streams`:

```python
from datadog_checks.base.checks.logs.crawler.base import LogCrawlerCheck
from datadog_checks.base.checks.logs.crawler.stream import LogStream

class AcmeCheck(LogCrawlerCheck):
__NAMESPACE__ = 'acme'

def get_log_streams(self):
return [AcmeLogStream(check=self, name='ACME log stream')]

class AcmeLogStream(LogStream):
"""Stream of Logs from ACME"""
```

Now running *the check command* will show a new error:

```
TypeError: Can't instantiate abstract class AcmeLogStream with abstract method records
```

Once again we need to define a method, this time [`LogStream.records`](../../base/logs-crawlers.md#datadog_checks.base.checks.logs.crawler.stream.LogStream.records).
This method accepts a `cursor` argument.
We ignore this argument for now and explain it later.


```python
from datadog_checks.base.checks.logs.crawler.stream import LogRecord, LogStream
from datadog_checks.base.utils.time import get_timestamp

... # Skip AcmeCheck to focus on LogStream.


class AcmeLogStream(LogStream):
"""Stream of Logs from ACME"""

def records(self, cursor=None):
return [
LogRecord(
data={'message': 'This is a log from ACME.', 'level': 'info'},
cursor={'timestamp': get_timestamp()},
)
]
```

There are several things going on here.
`AcmeLogStream.records` returns an iterator over `LogRecord` objects.
For simplicity, we return a list with just one record here.
After we understand what each `LogRecord` looks like, we can discuss how to generate multiple records.

### What is a Log Record?

The `LogRecord` class has two fields.
In `data` we put any data that we want to submit as a log to Datadog.
In `cursor` we store a unique identifier for this specific `LogRecord`.

We use the `cursor` field to checkpoint our progress as we scrape the external API.
In other words, every time our integration completes its run we save the last cursor we submitted.
We can then resume scraping from this cursor.
That's what the `cursor` argument to the `records` method is for.
The very first time the integration runs, this `cursor` is `None` because we have no checkpoints.
For every subsequent integration run, the `cursor` will be set to the `LogRecord.cursor` of the last `LogRecord` yielded or returned from `records`.
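
As a sketch of how a checkpoint-aware stream might use this argument, the version below resumes from the timestamp stored in the last cursor; `fetch_events` is a hypothetical helper standing in for whatever call retrieves new data from ACME.

```python
from datadog_checks.base.checks.logs.crawler.stream import LogRecord, LogStream


class AcmeLogStream(LogStream):
    """Stream of Logs from ACME"""

    def records(self, cursor=None):
        # On the very first run there is no checkpoint yet, so crawl everything.
        last_timestamp = cursor['timestamp'] if cursor else 0

        # fetch_events is a hypothetical helper standing in for the real ACME API call.
        for event in fetch_events(since=last_timestamp):
            yield LogRecord(
                data={'message': event['message'], 'level': event['level']},
                cursor={'timestamp': event['timestamp']},
            )
```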

Some things to consider when defining cursors:

- Use UTC timestamps!
- Only using the timestamp as a unique identifier may not be enough. We can have different records with the same timestamp.
- One popular identifier is the order of the log record in the stream. Whether this works or not depends on the API we are crawling.


### Scraping for Log Records

In our toy example we returned a list with just one record.
In practice we will need to create a list or lazy iterator over `LogRecord`s.
We will construct them from data that we collect from the external API, in this case the one from *ACME*.

Below are some tips and considerations when scraping external APIs, followed by a sketch that puts them together:

1. Use the `cursor` argument to checkpoint your progress.
1. The Agent schedules an integration run approximately every 10-15 seconds.
1. The intake won't accept logs that are older than 18 hours. For better performance, skip such logs as you generate `LogRecord` items.
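
As a rough sketch that combines these tips, the stream below queries a hypothetical `/logs` endpoint with the check's `http` wrapper, skips records older than 18 hours, and checkpoints on the timestamp of each record it yields. The URL, query parameters, response shape, and the stored check reference are all assumptions for illustration.

```python
from datadog_checks.base.checks.logs.crawler.stream import LogRecord, LogStream
from datadog_checks.base.utils.time import get_timestamp

EIGHTEEN_HOURS = 18 * 60 * 60


class AcmeLogStream(LogStream):
    """Stream of Logs from ACME"""

    def __init__(self, check, name):
        super().__init__(check=check, name=name)
        # Keep our own reference so records() can reuse the check's HTTP wrapper.
        self._check = check

    def records(self, cursor=None):
        # Resume from the last checkpoint, or crawl everything on the first run.
        last_timestamp = cursor['timestamp'] if cursor else 0

        # Hypothetical endpoint, query parameters, and response shape.
        response = self._check.http.get(
            'https://acme.example.com/api/v1/logs',
            params={'since': last_timestamp},
        )
        response.raise_for_status()

        for event in response.json():
            # The intake rejects logs older than 18 hours, so skip them early.
            if event['timestamp'] < get_timestamp() - EIGHTEEN_HOURS:
                continue
            yield LogRecord(
                data={'message': event['message'], 'level': event['level']},
                cursor={'timestamp': event['timestamp']},
            )
```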