[PROPOSAL] Pair TransportService / ActionListener initialization to allow shutdown/restart hot plugging #174

dbwiddis · 2022-10-10T17:25:02Z

What/Why

What are you proposing?

Establish the capability to:

Add a new extension without restarting OpenSearch
Remove an active extension (and its dependencies) without restarting OpenSearch
Auto-reboot extensions to handle transport failures (or upon request by a user)

What users have asked for this feature?

In PR #172 (based on #171), we pulled test code out of the ExtensionsRunner. This exposed a flaw with our current "initialize and leave it running forever" setup and the need to include TransportService.stop() somewhere in the API.

What problems are you trying to solve?

While adding a stop/start method is easy enough, the bigger question is, when will that be called? Testing as in #172 is only one use case. But we can also integrate this with our longer term goal of hot plugging extensions.
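To make the "paired" start/stop idea concrete, here is a minimal sketch of a lifecycle wrapper. It deliberately does not use the real TransportService API: `ExtensionLifecycle`, `onStart`, and `onStop` are hypothetical names, and the two `Runnable` hooks stand in for whatever initialization and teardown the SDK ends up needing.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: pairs every start with a matching stop so an
// extension can be shut down and restarted without leaking state.
class ExtensionLifecycle {
    enum State { STOPPED, STARTED }

    private final AtomicReference<State> state = new AtomicReference<>(State.STOPPED);
    private final Runnable onStart; // stand-in for transport/listener initialization
    private final Runnable onStop;  // stand-in for the matching teardown

    ExtensionLifecycle(Runnable onStart, Runnable onStop) {
        this.onStart = onStart;
        this.onStop = onStop;
    }

    /** Starts only if currently stopped; returns false if already started. */
    boolean start() {
        if (!state.compareAndSet(State.STOPPED, State.STARTED)) {
            return false;
        }
        try {
            onStart.run();
        } catch (RuntimeException e) {
            state.set(State.STOPPED); // failed start leaves us stopped
            throw e;
        }
        return true;
    }

    /** Stops only if currently started; returns false if already stopped. */
    boolean stop() {
        if (!state.compareAndSet(State.STARTED, State.STOPPED)) {
            return false;
        }
        onStop.run();
        return true;
    }

    /** Restart is simply the paired stop followed by start. */
    void restart() {
        stop();
        start();
    }

    State state() {
        return state.get();
    }
}
```

With this shape, a test can call `start()` and `stop()` around each case, and a future restart request can reuse the same two hooks.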

As part of that effort we are dealing with the sequencing of initializing Extensions, and knowing when they are initialized. See issues #65 and #94, and possible usage in #149 and #151.

During some of my REST handler testing, I hit a bug that caused Transport to fail, and the only fix was restarting the extension (and restarting OpenSearch). We want a future where we do not have to restart OpenSearch and can dynamically add or remove extensions on the fly.

What is the developer experience going to be?

To add a new extension on the fly, start it up and then make a REST request to connect it: PUT /_extensions/add/uniqueID
To remove an extension on the fly, make a REST request to disconnect it: PUT /_extensions/remove/uniqueID
To (attempt to) reconnect to a problematic extension, make a REST request that executes the above two steps and signals the extension to restart itself: PUT /_extensions/restart/uniqueID
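As an illustration of the proposed endpoints, the sketch below builds the three PUT requests with the JDK's `java.net.http` client. The endpoints do not exist yet; the base URL `http://localhost:9200` and the `ExtensionRequests` class are assumptions for the example, and the requests are only constructed, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for the proposed /_extensions/{action}/{uniqueID} API.
class ExtensionRequests {
    // Assumed address of a local OpenSearch node; adjust for a real cluster.
    static final String BASE = "http://localhost:9200";

    /** Builds a body-less PUT for one of the proposed actions: add, remove, or restart. */
    static HttpRequest build(String action, String uniqueId) {
        if (!action.equals("add") && !action.equals("remove") && !action.equals("restart")) {
            throw new IllegalArgumentException("Unknown action: " + action);
        }
        URI uri = URI.create(BASE + "/_extensions/" + action + "/" + uniqueId);
        return HttpRequest.newBuilder(uri)
            .PUT(HttpRequest.BodyPublishers.noBody())
            .build();
    }

    public static void main(String[] args) {
        // Print the method and path of each proposed request.
        for (String action : new String[] {"add", "remove", "restart"}) {
            HttpRequest request = build(action, "uniqueID");
            System.out.println(request.method() + " " + request.uri().getPath());
        }
    }
}
```

Sending one of these would be a matter of passing it to `HttpClient.newHttpClient().send(...)` once the endpoints exist.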

Are there any security considerations?

Extensions should require that any shutdown/reboot request comes from the OpenSearch instance they are (or have been) communicating with.
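One possible shape for that check, as a rough sketch: the extension records the identity of each OpenSearch node it has communicated with and rejects shutdown requests from anyone else. `ShutdownGuard`, `recordPeer`, and the string node ID are all hypothetical, and a real implementation would need an authenticated identity rather than a self-reported ID.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical guard: only peers seen earlier may request shutdown/restart.
class ShutdownGuard {
    // Thread-safe set of node IDs this extension has communicated with.
    private final Set<String> knownNodeIds = ConcurrentHashMap.newKeySet();

    /** Called when the extension first communicates with a node (for example, at initialization). */
    void recordPeer(String nodeId) {
        knownNodeIds.add(nodeId);
    }

    /** Returns true only if the requester is a previously recorded peer. */
    boolean mayShutdown(String requesterNodeId) {
        // The null check matters: the concurrent set rejects null lookups.
        return requesterNodeId != null && knownNodeIds.contains(requesterNodeId);
    }
}
```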

What is the user experience going to be?

A user can spin up a new extension on a new node (or same node) and issue a single REST request to activate it (and its dependencies).

A user can remove an extension, or restart it to upgrade to a new software version.

Why should it be built? Any reason not to?

There may be other ways to handle this, but connecting REST handlers to the transport service start/stop seems a great way to enable it.

What will it take to execute?

We'll need to create a design/sequence diagram that clearly identifies the complete set of initialization steps and what is required to "uninitialize".

Any remaining open questions?

Dependencies are a big unknown here. This proposal primarily deals with extensions without dependencies.


peterzhuamazon · 2022-11-10T22:17:56Z

If this file exists on the OS cluster, would it also track which plugin version it can support?
Or vice versa?

Thanks.

dbwiddis · 2022-11-10T22:24:04Z

If this file exists on the OS cluster, would it also track which plugin version it can support? Or vice versa?

I think it goes the other way around: an extension may require OpenSearch version X (or greater) which defines the SDK and capabilities that it has integrated.

I think the concept is that a file would establish an initial/baseline configuration, but that configuration could be updated via REST requests (including adding/removing extensions). The JSON for the request would include the necessary settings such as version compatibility. For example, if this plugin requires 2.4+, that could be expressed as a version range like [2.4.0,3.0.0).
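To show how a range like [2.4.0,3.0.0) could be evaluated, here is a minimal sketch of an interval check. It assumes plain numeric versions with the same number of components on both sides (no qualifiers such as -SNAPSHOT); `VersionRange` is a hypothetical class, not part of OpenSearch or the SDK.

```java
import java.util.Arrays;

// Hypothetical parser for interval notation: '[' or ']' is inclusive, '(' or ')' is exclusive.
class VersionRange {
    private final int[] lower;
    private final int[] upper;
    private final boolean lowerInclusive;
    private final boolean upperInclusive;

    VersionRange(String spec) {
        String s = spec.trim();
        lowerInclusive = s.startsWith("[");
        upperInclusive = s.endsWith("]");
        // Strip the two bracket characters, then split the bounds.
        String[] bounds = s.substring(1, s.length() - 1).split(",");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected two bounds: " + spec);
        }
        lower = parse(bounds[0]);
        upper = parse(bounds[1]);
    }

    /** Splits "2.4.0" into {2, 4, 0}. */
    static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\."))
            .mapToInt(Integer::parseInt)
            .toArray();
    }

    /** Compares component by component, so 2.10.0 sorts after 2.4.0. */
    boolean contains(String version) {
        int[] v = parse(version);
        int lo = Arrays.compare(v, lower);
        int hi = Arrays.compare(v, upper);
        boolean aboveLower = lo > 0 || (lo == 0 && lowerInclusive);
        boolean belowUpper = hi < 0 || (hi == 0 && upperInclusive);
        return aboveLower && belowUpper;
    }
}
```

For the range in the comment above, `new VersionRange("[2.4.0,3.0.0)").contains("2.4.0")` is true and `contains("3.0.0")` is false, since the upper bound is exclusive.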

dbwiddis · 2022-11-10T22:27:57Z

@peterzhuamazon also see #65 which directly addresses what goes into the yml file. This issue discusses updating that on the fly but #65 would be the baseline and the necessary fields to set.

peterzhuamazon · 2022-11-10T23:33:29Z

@peterzhuamazon also see #65 which directly addresses what goes into the yml file. This issue discusses updating that on the fly but #65 would be the baseline and the necessary fields to set.

Thanks @dbwiddis for the explanation. I will keep my eyes on that issue.

dbwiddis · 2023-01-24T05:38:29Z

Closing this and tracking in #356

dbwiddis mentioned this issue Oct 10, 2022

Refactor ExtensionsRunner test code out of production source tree #172

Merged

dbwiddis mentioned this issue Nov 14, 2022

Make NamedXContent available to extensions #244

Merged

dbwiddis mentioned this issue Jan 24, 2023

[META] Add ability to add/update/remove an extension without restarting OpenSearch #356

Open

4 tasks

dbwiddis closed this as completed Jan 24, 2023
