Metering connector actions request body bytes #186804

Merged
merged 41 commits into from
Aug 27, 2024

Contributor

@ersin-erdal ersin-erdal commented Jun 24, 2024

Towards: https://github.com/elastic/response-ops-team/issues/209

This PR collects the body bytes of the requests that connectors make to third parties and saves them in the event log.

There is a new metric collector: ConnectorMetricsCollector.
The Action TaskRunner creates a new instance of it on each action execution and passes it to actionType.executor.
The actionType.executor then passes it to the request function provided by the actions plugin.
The request function passes the response (or the error) from axios to the addRequestBodyBytes method of the ConnectorMetricsCollector.
Since axios always returns request.headers['Content-Length'], in either the success result or the error, the metric collector uses its value as the request body bytes.

If there is no Content-Length header, the addRequestBodyBytes method falls back to the body object that we pass as the second param. It calculates the body bytes with Buffer.byteLength(body, 'utf8'), which axios also uses to populate request.headers['Content-Length'].

For the connectors or subActions that don't use the request function or axios:
the addRequestBodyBytes method is called just before making the request, with only the body param, to force it to use the fallback.

Note: If an execution makes more than one request, the bytes are summed.
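
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `MetricsCollectorSketch`, `AxiosResultLike` and `getRequestBodyBytes` are hypothetical names, the axios result is reduced to the `request.headers` shape mentioned in the description, and `TextEncoder` stands in for `Buffer.byteLength(body, 'utf8')` so the snippet has no Node-specific dependency. It prefers the Content-Length header, falls back to the body, and sums across calls.

```typescript
// Hypothetical, reduced stand-ins for the axios response/error shapes.
interface RequestLike {
  headers?: Record<string, string | number | undefined>;
}
interface AxiosResultLike {
  request?: RequestLike;
}

class MetricsCollectorSketch {
  private requestBodyBytes = 0;

  // `result` is the axios response or error; `body` is the fallback payload.
  public addRequestBodyBytes(result?: AxiosResultLike, body: string | object = ''): void {
    const header = result?.request?.headers?.['Content-Length'];
    const fromHeader = Number(header);
    if (header !== undefined && header !== '' && isFinite(fromHeader)) {
      // Common case: trust the header value.
      this.requestBodyBytes += fromHeader;
      return;
    }
    // Fallback: count the UTF-8 bytes of the body we were given.
    const text = typeof body === 'string' ? body : JSON.stringify(body);
    this.requestBodyBytes += new TextEncoder().encode(text).length;
  }

  // Total for the whole execution (multiple requests are summed).
  public getRequestBodyBytes(): number {
    return this.requestBodyBytes;
  }
}
```

One instance per execution keeps the total scoped to that execution, which matches the "new instance on each action execution" behaviour described above.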

To verify:

Create a rule with a connector that you would like to test.
Let the rule run and check the event log of your connector; the request body bytes should be saved in:
kibana.action.execution.metrics.request_body_bytes

Alternatively:
You can create a connector and run it on its test tab.

You can use the below query to check the event-log:

{
  "query": {
    "bool": {
      "must": [
        { "match": { "event.provider": "actions" } },
        { "match": { "kibana.action.type_id": "{**your-action-type-id**}" } }
      ],
      "filter": [
        { "term": { "event.action": "execute" } }
      ]
    }
  },
  "size": 100
}
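
If you prefer to build the query and total the results in a script, something like the sketch below may help. The function names and the `EventLogDoc` shape are assumptions made for illustration; only the field names come from the query and the field path above. `.slack` is used as an example type id.

```typescript
// Builds the event-log query from the description for one connector type id.
function buildActionExecuteQuery(actionTypeId: string, size = 100) {
  return {
    query: {
      bool: {
        must: [
          { match: { 'event.provider': 'actions' } },
          { match: { 'kibana.action.type_id': actionTypeId } },
        ],
        filter: [{ term: { 'event.action': 'execute' } }],
      },
    },
    size,
  };
}

// Assumed shape of a returned event-log document (only the part we read).
interface EventLogDoc {
  kibana?: {
    action?: { execution?: { metrics?: { request_body_bytes?: number } } };
  };
}

// Sums kibana.action.execution.metrics.request_body_bytes; missing values count as 0.
function sumRequestBodyBytes(docs: EventLogDoc[]): number {
  return docs.reduce(
    (total, doc) => total + (doc.kibana?.action?.execution?.metrics?.request_body_bytes ?? 0),
    0
  );
}

const exampleQuery = buildActionExecuteQuery('.slack');
```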

@ersin-erdal ersin-erdal added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.15.0 labels Jun 24, 2024
@ersin-erdal ersin-erdal self-assigned this Jun 24, 2024
@ersin-erdal ersin-erdal force-pushed the 209-meter-request-bytes branch from 0d8dd42 to 1981342 Compare July 7, 2024 18:56
@ersin-erdal ersin-erdal force-pushed the 209-meter-request-bytes branch from e898038 to 4d180d6 Compare July 9, 2024 17:27
@ersin-erdal
Contributor Author

/ci

@ersin-erdal ersin-erdal marked this pull request as ready for review July 9, 2024 23:34
@ersin-erdal ersin-erdal requested review from a team as code owners July 9, 2024 23:34
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@ersin-erdal ersin-erdal added the release_note:skip Skip the PR/issue when compiling release notes label Jul 9, 2024
zipPassCode,
}: SentinelOneFetchAgentFilesParams) {
const agent = await this.getAgents({ uuid: agentUUID });
public async fetchAgentFiles(
Contributor

@ersin-erdal ,
I read the PR description and just want to make sure that the changes made to these public methods have no impact on consumers of the connector's sub-actions, correct? The second argument (ConnectorMetricsCollector) is something that the Action task runner will take care of internally?

Also,
Is there a way you can trigger the Security Solution unit tests to run on this PR? They currently only run if files from the Security Solution plugin are changed, so they did not run on this one. If needed, you can make a change (a JS comment would do, I think) to one of our files, perhaps one of these:

That should trigger CI to run our full test suite.

cc/ @tomsonpl - just FYI

Contributor Author

Correct, I intentionally didn't touch any params of the connector types, just passed the ConnectorMetricsCollector as a second param.

Connector types just pass it to the request function, so it can collect the metrics there.
Then the Actions plugin saves the data it collected in the event log.

There shouldn't be any change or impact on the connectors, they should work as they were.
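
To illustrate why consumers should be unaffected, here is a rough sketch of the pass-through. Every name in it (`UsageCollectorLike`, `ConnectorSketch`, `runAction`, the `/agents` path) is hypothetical, no real HTTP call is made, and the methods are synchronous for brevity whereas the real connector methods are async. The first argument keeps its existing shape; the collector only travels alongside it.

```typescript
// Hypothetical minimal collector interface.
interface UsageCollectorLike {
  addRequestBodyBytes(body: string): void;
  total(): number;
}

function createCollector(): UsageCollectorLike {
  let bytes = 0;
  return {
    addRequestBodyBytes(body) {
      bytes += new TextEncoder().encode(body).length;
    },
    total: () => bytes,
  };
}

interface GetAgentsParams {
  uuid: string;
}

class ConnectorSketch {
  // First argument: the unchanged sub-action params. Second: the injected collector.
  public getAgents(params: GetAgentsParams, collector: UsageCollectorLike): string {
    return this.request({ url: '/agents', body: JSON.stringify(params) }, collector);
  }

  // Stand-in for the shared request function; it is the only place that records bytes.
  private request(opts: { url: string; body: string }, collector: UsageCollectorLike): string {
    collector.addRequestBodyBytes(opts.body);
    return `sent ${opts.url}`;
  }
}

// Stand-in for the task runner: one collector per execution, read back afterwards.
function runAction(params: GetAgentsParams): number {
  const collector = createCollector();
  new ConnectorSketch().getAgents(params, collector);
  return collector.total(); // this value would be written to the event log
}
```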

I will add a comment in Security Solution as you suggest. Thanks.

Member

@pmuellr pmuellr left a comment

I didn't have a chance to do a complete review, but looked at connector_metrics_collector.ts to see how we were implementing the core logic. Basic idea is sound, but I think we need to re-arrange the logic and add a try/catch on the stringify ...

};

public addRequestBodyBytes(result?: AxiosError | AxiosResponse, body: string | object = '') {
const sBody = typeof body === 'string' ? body : JSON.stringify(body);
Member

I don't have a great feeling about depending on axios to send a string | object. Or, I'm afraid of using JSON.stringify() without a try/catch - it can throw :-). I'd be ok with using a ${body} version of the content length in case of an error.

Since this could be expensive, I think we should move this AFTER the Content-Length check, since that seems highly likely to be the common case. So just check if Content-Length is there and return that, otherwise do the other calculations (stringify, byteLength, etc).

Contributor Author

It's not axios returning string | object; it's us passing the body as a fallback here, and we always pass it either as a string or an object. And breaking JSON.stringify is really hard, it just stringifies whatever you pass :)

But you are right, putting it after the Content-Length check is better.
Moved it and wrapped it with try-catch just in case.
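
For reference, a fallback along the lines discussed in this thread might look like the sketch below. `bodyByteLength` is a hypothetical helper, not the function in the PR, and `TextEncoder` is used in place of `Buffer.byteLength` to keep it dependency-free. `JSON.stringify` does throw for circular structures (and for BigInt values), so the catch branch uses the `${body}` form suggested above.

```typescript
// Hypothetical helper: UTF-8 byte length of a fallback body, never throwing.
function bodyByteLength(body: string | object): number {
  let text: string;
  if (typeof body === 'string') {
    text = body;
  } else {
    try {
      text = JSON.stringify(body);
    } catch {
      // e.g. a circular reference; fall back to the template-string form
      text = `${body}`;
    }
  }
  return new TextEncoder().encode(text).length;
}

// A circular object makes JSON.stringify throw, exercising the catch branch.
const circularExample: Record<string, unknown> = {};
circularExample.self = circularExample;
const circularBytes = bodyByteLength(circularExample);
```

This would only run when no Content-Length header is available, so the stringify cost is not paid in the common case.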

@@ -345,6 +356,7 @@ describe('createActionEventLogRecordObject', () => {
],
action: {
name: 'test name',
type_id: '.slack',
Member

this must have been the source of the question on snake- vs camel- case hahahaha

This type_id seems out of place to me. Is this the way we do it with rule id's?

Contributor Author

@ersin-erdal ersin-erdal Jul 13, 2024

No, it was the SO name :) Like action or alert.
This is a field name. It is the same for the rule, just the field name there is rule_type_id... but rule.rule_type_id looks weird to me...

@elasticmachine
Contributor

elasticmachine commented Jul 13, 2024

⏳ Build in-progress, with failures

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #47 / apis ESQL sync error messages Using string field type: text and number field type: integer Checking error messages
  • [job] [logs] FTR Configs #47 / apis ESQL sync error messages Using string field type: text and number field type: integer Checking error messages

History

cc @ersin-erdal

Member

@pmuellr pmuellr left a comment

LGTM, thx for the changes and explanations, sorry for the delay!

@ersin-erdal ersin-erdal removed the request for review from a team August 20, 2024 10:36
@ersin-erdal ersin-erdal requested a review from a team as a code owner August 21, 2024 11:09
@ersin-erdal ersin-erdal removed the request for review from a team August 21, 2024 11:22
@kibana-ci
Collaborator

kibana-ci commented Aug 26, 2024

💚 Build Succeeded

  • Buildkite Build
  • Commit: 23b4921
  • Kibana Serverless Image: docker.elastic.co/kibana-ci/kibana-serverless:pr-186804-23b4921310d9

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id       before  after  diff
actions  301     308    +7

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id       before  after  diff
actions  32      33     +1

Unknown metric groups

API count

id       before  after  diff
actions  307     314    +7

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ersin-erdal

Contributor

@tomsonpl tomsonpl left a comment

Defend Workflows code review 👍 and tested the CrowdStrike connector manually.

Thanks!

@ersin-erdal ersin-erdal merged commit 9372027 into elastic:main Aug 27, 2024
21 checks passed
@ersin-erdal ersin-erdal deleted the 209-meter-request-bytes branch August 27, 2024 09:48
@kibanamachine
Contributor

💔 All backports failed

Status Branch Result
8.15 Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 186804

Questions ?

Please refer to the Backport tool documentation

Member

@ashokaditya ashokaditya left a comment

I noticed this while testing for alerts for SentinelOne and Elastic Defend endpoints.

Screenshot 2024-08-27 at 14 26 23

API error
Screenshot 2024-08-27 at 14 26 35

error response
Screenshot 2024-08-27 at 14 26 44

Comment on lines +205 to +206
queryParams?: SentinelOneGetActivitiesParams,
connectorUsageCollector?: ConnectorUsageCollector
Member

Suggested change
- queryParams?: SentinelOneGetActivitiesParams,
- connectorUsageCollector?: ConnectorUsageCollector
+ queryParams: SentinelOneGetActivitiesParams | undefined,
+ connectorUsageCollector: ConnectorUsageCollector

params: queryParams,
responseSchema: SentinelOneGetActivitiesResponseSchema,
},
connectorUsageCollector!
Member

if you apply the above suggestion then:

Suggested change
- connectorUsageCollector!
+ connectorUsageCollector
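
As one possible reading of the two suggestions above, the sketch below contrasts an optional parameter plus a non-null assertion with a required parameter. All names are hypothetical and the functions are simplified stand-ins, not the SentinelOne connector code. The optional form compiles even when the caller omits the collector and only fails at runtime; the required form moves that failure to compile time.

```typescript
interface CollectorLike {
  addRequestBodyBytes(bytes: number): void;
}

// Stand-in for the shared request function that records usage.
function record(collector: CollectorLike, bytes: number): void {
  collector.addRequestBodyBytes(bytes);
}

// Optional parameter + `!`: the compiler is silenced, so a missing collector
// surfaces as a runtime TypeError inside `record`.
function getActivitiesOptional(collector?: CollectorLike): void {
  record(collector!, 10);
}

// Required parameter: calling this without a collector does not compile,
// and no assertion is needed when forwarding it.
function getActivitiesRequired(collector: CollectorLike): void {
  record(collector, 10);
}
```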

@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Aug 27, 2024
ersin-erdal added a commit that referenced this pull request Oct 31, 2024
Resolves: elastic/response-ops-team#209 

This PR is a follow-on of #186804.

Creates a new task that runs every 1 hour to push the total
connector-request-body-bytes that have been saved in the event log to
usage-api.
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 31, 2024
Resolves: elastic/response-ops-team#209

This PR is a follow-on of elastic#186804.

Creates a new task that runs every 1 hour to push the total
connector-request-body-bytes that have been saved in the event log to
usage-api.

(cherry picked from commit 216f899)
nreese pushed a commit to nreese/kibana that referenced this pull request Nov 1, 2024
Resolves: elastic/response-ops-team#209 

This PR is a follow-on of elastic#186804.

Creates a new task that runs every 1 hour to push the total
connector-request-body-bytes that have been saved in the event log to
usage-api.
Labels
apm:review backport:skip This commit does not require backporting ci:project-deploy-observability Create an Observability project release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.16.0