[filebeat] Elasticsearch state storage for httpjson and cel inputs #41446
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
// Injecting the ApiKey that has enough permissions to write to the index
// TODO: need to figure out how add permissions for the state index
// agentless-state-<input id>, for example httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959
apiKey := os.Getenv("AGENTLESS_ELASTICSEARCH_APIKEY")
will collaborate with agentless team on addressing this part
When running under Elastic agent, every change of the output configuration results in a restart of the Beat process, in case that simplifies anything here for you.
@belimawr @cmacknz (or whoever wants to or has time to be involved)
@leehinman I'd appreciate a review here to make sure this can co-exist with Beats receivers in agent since that would be the long term way we plan to run agentless inputs.
// TODO: REMOVE THIS HACK BEFORE MERGE. LEAVING FOR TESTING FOR DRAFT
// Injecting the ApiKey that has enough permissions to write to the index
// TODO: need to figure out how add permissions for the state index
Fleet knows when something is an agentless package and that is probably what would hook into this to generate the key.
We could add a new state storage section to an agent policy (agent.storage?) that Fleet knows how to template when this happens.
Agent could then send it down as another output unit with a new type (or we could define a new type of unit but that is even more work).
This would allow the key to update on the fly through Fleet and control protocol.
This could also possibly be handled in the agentless api / controller and hidden from Fleet if we just inject it in as an env var. No opposition to that either really.
> This could also possibly be handled in the agentless api / controller and hidden from Fleet if we just inject it in as an env var. No opposition to that either really.
I brought this up during the meeting today as an option. IMHO it's just one thing to manage, might be cleaner if all in one place in the policy.
One detail we need to think about with respect to these keys is what the process should be for rotating and/or revoking them.
filebeat/features/features.go
// List of input types Elasticsearch state store is enabled for
var esTypesEnabled = map[string]void{
	"httpjson": {},
Can this be configuration instead of in the code, maybe another env var?
Sure can do. Something like this?
AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES=httpjson,cel
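A minimal sketch of how such a variable could be read, assuming a comma-separated list of input type names. The helper name `parseInputTypes` is hypothetical and is not taken from this PR; only the variable name comes from the comment above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseInputTypes (hypothetical helper) turns a comma-separated list such as
// "httpjson,cel" into a set. Blank entries and surrounding spaces are ignored.
func parseInputTypes(raw string) map[string]struct{} {
	types := make(map[string]struct{})
	for _, t := range strings.Split(raw, ",") {
		t = strings.ToLower(strings.TrimSpace(t))
		if t != "" {
			types[t] = struct{}{}
		}
	}
	return types
}

func main() {
	// An unset or empty variable yields an empty set, so no input type is enabled.
	enabled := parseInputTypes(os.Getenv("AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES"))
	_, ok := enabled["httpjson"]
	fmt.Println("httpjson uses the Elasticsearch state store:", ok)
}
```

With this shape, leaving the variable unset keeps every input on the file-based store.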
}

func (s *store) get(key string, to interface{}) error {
	status, data, err := s.cli.Request("GET", fmt.Sprintf("/%s/%s/%s", s.index, docType, url.QueryEscape(key)), "", nil, nil)
These requests should all be tied to a context.
Also, they probably need some minimum amount of retries.
The biggest design difference with ES is now the requests can fail. A file on disk doesn't give us 429 errors.
At a very high level, it feels like the way we deal with this is:
- Don't start or allow the input to progress until it has successfully initialized the state at least once to avoid massively duplicating data.
- Writes are asynchronous from the caller's perspective and the latest state is continuously retried.
> These requests should all be tied to a context.
Looks like the current implementation of the client uses the context
beats/libbeat/esleg/eslegclient/connection.go, line 406 in 249d0dc:
req, err := http.NewRequestWithContext(conn.reqsContext, method, url, body)
that is set when the client is constructed
beats/libbeat/esleg/eslegclient/connection.go, line 262 in 249d0dc:
for _, client := range clients {
beats/libbeat/esleg/eslegclient/connection.go, line 286 in 249d0dc:
conn.reqsContext = ctx
To simplify the PR, would it help to pull this part out and/or just always delay the store initialization when run under Elastic Agent?
For the rest of the PR, I think reviewing this would be easier if we had a design doc that addressed the following questions:
Still reviewing, but I wanted to point out that this won't work at all for a beat receiver. For a beat receiver the output (in the beat configuration part) will always be
Yes an explicit storage extension in Beats itself would make this much easier to do. Unfortunately we don't have that.
It would. But I was more thinking that we could modify the signature of NewBeatReceiver, so we could pass in a storage extension and store it in the beat.Info like we do for the LogConsumer. The filebeat Run function would then have access to this, so if it was present it could use it. This would make the state store more like logging and the consumer, where configuration is handled at the otel level.
Added example. Now no input types are enabled by default for Elasticsearch state storage.
Switching this PR from draft.
Pinging @elastic/sec-deployment-and-devices (Team:Security-Deployment and Devices)
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Proposed commit message
[filebeat] Elasticsearch state storage for httpjson input
This is a POC for Elasticsearch as a state store backend for Security integrations in the Agentless solution.
The scope of this change was narrowed down to supporting only httpjson inputs in order to support the Okta integration for the initial release. All other integration inputs still use the file storage as before.
This is a short-term solution for state storage in the k8s environment.
This is the first cut and the details can change depending on the feedback.
The feature can currently be enabled with AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED; how this will be configured in k8s is to be decided.
This change currently contains a hacky approach to the AGENTLESS_ELASTICSEARCH_APIKEY overwrite. It allows the user to provide an ApiKey with the elevated permissions required to create, write, and read the state index per input. THIS IS FOR DEVELOPMENT/TESTING ONLY. REMOVE BEFORE THE MERGE.
The existing code relied on the input state storage being fully configured before the main Beat manager runs. The change delays the configuration of the httpjson input until the actual configuration is received from the Agent.
There is an assumption that the index template for the state storage indices is already in place before the storage is used.
Example of the state storage index content for Okta integration:
The naming convention for all state stores is agentless-state-<input id>, since the expectation for agentless is that we have only one agent per policy and the agents are ephemeral.
Currently, in order to run the agent with Elasticsearch state storage, a couple of environment variables are required:
where the ApiKey in the
DEPENDENCIES / TODOS:
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Disruptive User Impact
The change should have no impact: without the feature enabled, Filebeat should work as before, using the file system storage for the state.