
Global HTTP Agent for Kibana to ES communication #99736

Closed
kobelb opened this issue May 10, 2021 · 14 comments
Labels
bug Fixes for quality problems that affect the customer experience Feature:elasticsearch Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@kobelb
Contributor

kobelb commented May 10, 2021

When investigating Kibana's performance for Fleet, I noticed that the legacy and the new Elasticsearch clients each handled the HTTP agent a bit differently, and each had its own HTTP agent.

As a result, we ended up with at least two separate pools of available sockets: one pool for the legacy client and one for the new client. When Kibana's communication to Elasticsearch occurs over TLS, there's a rather significant overhead to establishing these sockets, so the more often we can reuse an already established socket instead of establishing a new one, the better.
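For context, a minimal sketch of the kind of socket reuse at stake, using Node's built-in https module (the endpoint is a placeholder):

```ts
import https from 'https';

// One keep-alive agent: established sockets (and their TLS sessions) are
// returned to a pool and reused, instead of paying a fresh TLS handshake
// on every request.
const agent = new https.Agent({ keepAlive: true });

https.get('https://es.example.com:9200/_cluster/health', { agent }, (res) => {
  res.resume();
  // Once the first response ends, its socket is back in the pool and the
  // second request can reuse it.
  res.on('end', () => {
    https.get('https://es.example.com:9200/_nodes', { agent }, (r) => r.resume());
  });
});
```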

I started on #79667 to address this issue, but I haven't found the time to push it across the line...

@kobelb kobelb added bug Fixes for quality problems that affect the customer experience Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc labels May 10, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@mshustov
Contributor

mshustov commented May 11, 2021

When Kibana's communication to Elasticsearch occurs over TLS, there's a rather significant overhead to establishing these sockets, so the more often we can reuse an already established socket instead of establishing a new one, the better.

I suppose the migration from the legacy client addresses this problem? Maybe we should focus on #83910? The migration also addresses performance and o11y concerns.

@kobelb
Contributor Author

kobelb commented May 11, 2021

I suppose the migration from the legacy client addresses this problem? Maybe we should focus on #83910? The migration also addresses performance and o11y concerns.

It might; however, it'd be great to fix this issue before 8.0... To be honest, I've forgotten the details. I also recall either the legacy or the new client not reusing the same HTTP Agent in all scenarios.

@mshustov
Contributor

It might; however, it'd be great to fix this issue before 8.0

We are planning to remove the legacy ES client by v7.16. Let's keep the issue open to investigate whether the new ES client reuses the same HTTP agent.

@mshustov
Contributor

@kobelb can we close the issue, as there is no legacy client in the Kibana codebase anymore?

@kobelb
Contributor Author

kobelb commented Dec 15, 2021

@mshustov I'm still seeing multiple HTTP Agents being constructed. Whenever we instantiate a new instance of @elastic/elasticsearch's Client, a new HTTP Agent is created. The following are the code paths I've found that do so:

```ts
this.asInternalUser = configureClient(config, { logger, type, getExecutionContext });
```

```ts
this.rootScopedClient = configureClient(config, {
  logger,
  type,
  getExecutionContext,
  scoped: true,
});
```

```ts
const cluster = createClient('monitoring', {
  ...(isMonitoringCluster ? elasticsearchConfig : {}),
  plugins: [monitoringBulk, monitoringEndpointDisableWatches],
} as ESClusterConfig);
```

We're definitely better off than we were before, but the problem isn't entirely resolved.

@pgayvallet
Contributor

I'm still seeing multiple HTTP Agents being constructed

Is this really an issue given these agents are sharing the same connection pool?

@kobelb
Contributor Author

kobelb commented Dec 16, 2021

Is this really an issue given these agents are sharing the same connection pool?

They are? It's my understanding that each HTTP Agent is a connection pool, so if we have multiple HTTP Agents, we have multiple connection pools. Granted, I only saw 3 instances of the HTTP Agent being created for communication with Elasticsearch, so the situation isn't abysmal, but it could be improved.

@pgayvallet
Contributor

It's my understanding that each HTTP Agent is a connection pool

There are distinct ConnectionPool and agent options when creating a client. However, you're probably right in the sense that different non-child instances of a Client probably instantiate their own ConnectionPool.
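For reference, a sketch of those two separate knobs when constructing a client (assuming the 7.x @elastic/elasticsearch options; the node URL is a placeholder):

```ts
import { Client, ConnectionPool } from '@elastic/elasticsearch';

// `ConnectionPool` is the class the client instantiates to track nodes;
// `agent` holds the http.Agent options applied to each node's connection.
// They are independent settings.
const client = new Client({
  node: 'https://es.example.com:9200',
  ConnectionPool,
  agent: { keepAlive: true, maxSockets: 256 },
});
```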

cc @delvedor could you help us here?

  • Can you confirm that clients created via the parent.child() API share the same connection pool as their parents and siblings (as long as the ConnectionPool option is not provided when calling .child, that is)?
  • Can you confirm that clients created directly via a call to new Client(...) do not?

If the second point is confirmed, what would be the easiest option in your opinion to have all of our clients share the same connection pool? A custom ConnectionPool class? Having all the clients share a common parent? Something else?

@mshustov
Contributor

mshustov commented Dec 17, 2021

Can you confirm that clients created via the parent.child() API share the same connection pool as their parents and siblings (as long as the ConnectionPool option is not provided when calling .child, that is)?

From the docs: "the client offers a child API, which returns a new client instance that shares the connection pool with the parent client." https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/master/child.html
and code: https://github.com/elastic/elasticsearch-js/blob/main/src/client.ts#L214

Can you confirm that clients created directly via a call to new Client(...) do not?

They don't. https://github.com/elastic/elasticsearch-js/blob/main/src/client.ts#L220
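A minimal sketch of the difference, based on the child API linked above (the node URL is a placeholder):

```ts
import { Client } from '@elastic/elasticsearch';

const parent = new Client({ node: 'https://es.example.com:9200' });

// Shares the parent's connection pool, and therefore its per-node agents.
const child = parent.child({ headers: { 'x-opaque-id': 'my-plugin' } });

// Builds its own connection pool, with a fresh agent per node.
const separate = new Client({ node: 'https://es.example.com:9200' });
```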

If the second point is confirmed, what would be the easiest option in your opinion to have all of our clients share the same connection pool?

Should we? Different clients can connect to different hosts, so sockets won't be reused. The monitoring plugin can connect to a different ES cluster, for example.

```ts
const cluster = createClient('monitoring', {
  ...(isMonitoringCluster ? elasticsearchConfig : {}),
  plugins: [monitoringBulk, monitoringEndpointDisableWatches],
} as ESClusterConfig);
```

We can share a connection pool for the Core clients though (by using the .child API, for example):

```ts
this.asInternalUser = configureClient(config, { logger, type, getExecutionContext });
this.rootScopedClient = configureClient(config, {
  logger,
  type,
  getExecutionContext,
  scoped: true,
});
```

@delvedor
Member

delvedor commented Dec 17, 2021

Everything that @mshustov said.
Also, I want to clarify the naming: when you say "connection pool", you mean the client class that is used for keeping track of all the connected nodes. Every node (which is part of the client's connection pool) then contains an agent instance, which has its own pool of sockets connected to that specific node.

As a result, you will see an agent instance per connection created.
For example, if Kibana is connecting to a cluster of 5 nodes and creates 3 separate clients (without the .child API), you will have 15 HTTP agent instances. If instead you have a single parent client that you use to create the 3 "top level" clients, you will have 5 HTTP agent instances.

It's also worth mentioning that the internal HTTP agent uses a LIFO queue, which means it will create as many sockets as needed if there is a burst of requests, but will try to close them as soon as the burst finishes (see the scheduling option).
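For reference, a sketch of that agent behavior using Node's http module (option names per the Node.js docs):

```ts
import http from 'http';

// LIFO scheduling hands out the most recently released socket first, so the
// extra sockets created during a burst sit idle afterwards and can be closed,
// letting the pool shrink back down once the burst is over.
const agent = new http.Agent({
  keepAlive: true,
  scheduling: 'lifo',
});
```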

@kobelb
Contributor Author

kobelb commented Dec 17, 2021

Should we? Different clients can connect to different hosts, so sockets won't be reused. The monitoring plugin can connect to a different ES cluster, for example.

It's a good question. By default, all outbound HTTP requests made using the http library use a global singleton instance of the HTTP Agent. Based on this, we know that a single HTTP Agent connection pool can serve multiple hosts.
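To illustrate per-origin pooling on a single agent (host names are placeholders):

```ts
import http from 'http';

// One agent keys its socket pools by origin (host:port), so a single pool
// instance can serve requests to multiple hosts.
const agent = new http.Agent({ keepAlive: true });
http.get('http://es-data.example.com:9200/', { agent }, (res) => res.resume());
http.get('http://es-monitoring.example.com:9200/', { agent }, (res) => res.resume());

// Omitting `agent` falls back to the global singleton, http.globalAgent.
http.get('http://es-data.example.com:9200/', (res) => res.resume());
```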

However, if the user has configured the monitoring ES hosts to be different than the normal ES hosts, it doesn't make much sense to reuse the same sockets.

@afharo
Member

afharo commented Jul 4, 2024

@gsoldevila, I think you already implemented this, right? Can we close this issue?

@gsoldevila
Contributor

gsoldevila commented Jul 5, 2024

Yes, even from monitoring we are calling @kbn/core-elasticsearch-server's createClusterClient().
In PR #137748 we introduced an agent manager so that the same Agent is shared across all Client instances:

https://github.com/elastic/kibana/blob/main/packages/core/elasticsearch/core-elasticsearch-server-internal/src/elasticsearch_service.ts#L222,L234
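For illustration only, a hypothetical sketch of the agent-manager idea (class and method names are invented here, not the actual Kibana implementation):

```ts
import { Agent, AgentOptions } from 'https';

// Hypothetical: hand out one shared Agent per distinct configuration, so
// every Client built with the same config reuses the same socket pools.
class AgentManager {
  private readonly agents = new Map<string, Agent>();

  public getAgent(options: AgentOptions = {}): Agent {
    const key = JSON.stringify(options);
    let agent = this.agents.get(key);
    if (!agent) {
      agent = new Agent({ keepAlive: true, ...options });
      this.agents.set(key, agent);
    }
    return agent;
  }
}
```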

AFAIK each Agent has one socket pool per origin, which is why it accepts maxSockets (per origin) and also maxTotalSockets (global, across all origins), so I think we're good to close this one.
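For reference, the two limits on Node's agent (a sketch):

```ts
import https from 'https';

const agent = new https.Agent({
  keepAlive: true,
  maxSockets: 256,       // cap per origin (host:port)
  maxTotalSockets: 1024, // cap across all origins combined
});
```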
