[RFC] Expose Geo2IP Enrichment feature to other plugins to standardize OpenSearch IP enrichment offering #698

Closed
andy-k-improving opened this issue Nov 18, 2024 · 6 comments · Fixed by #700 or #706
Labels
enhancement New feature or request

Comments

@andy-k-improving
Contributor

andy-k-improving commented Nov 18, 2024

The purpose of this RFC (request for comments) is to gather community feedback on a proposal to expose the IP enrichment functionality as a registered Action in OpenSearch. This would allow other OpenSearch plugins to leverage the enrichment instead of reinventing the wheel.

Problem Statement

At the moment the GeoSpatial plugin does offer an IP enrichment feature; however, its scope is limited to document ingestion and to the GeoSpatial plugin itself.
This segregation forces other plugins in the ecosystem that need IP conversion to reinvent the wheel and provide the same conversion themselves, even when the GeoSpatial plugin is present in the system.
This results in fragmentation and duplicated effort around the IP enrichment capability for the OpenSearch platform as a whole.

Current State

The OpenSearch IP enrichment in the GeoSpatial plugin is only available during document ingestion, which does not cover all use cases. One example is executing a PPL command in the SQL plugin, where a given IP address needs to be converted into location information such as city name, country name, etc.
Unless the GeoSpatial plugin exposes this functionality, the SQL plugin team will be forced to implement similar, if not identical, logic to perform the same operation that GeoSpatial currently provides (converting the IP string into location data).
This is not desirable from the OpenSearch platform perspective, as it would further fragment the IP enrichment functionality.

Proposal

Expose the IP-to-geo-data conversion as a standalone function and make it accessible via NodeClient.execute( ) at OpenSearch runtime, which allows other plugins to dynamically check for and leverage this conversion rather than maintaining their own conversion feature.

Approach

Register a new transportAction at plugin bootstrap time, which is provided as:

  1. A standalone jar module that other plugins can import as an interface to perform the IP address lookup.
  2. The jar module should be self-contained, with all the necessary classes and objects, such as ActionType, ActionRequest and ActionResponse.
  3. The jar module should be decoupled from the implementation and allow users (other plugins) to interact with the CRUD operations of the dataset and perform IP string conversion as needed.
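
The decoupling described in the list above could be sketched roughly as follows, in plain Java with no OpenSearch dependency. Every name here (`IpLookup`, `ActionRegistry`, `RegistryDemo`, and the action name string) is a hypothetical stand-in for illustration; the actual plugin would register a transport action through OpenSearch's plugin mechanism rather than through this toy registry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical interface that would live in the standalone client jar.
interface IpLookup {
    Map<String, Object> lookup(String ipAddress, String datasource);
}

// Toy stand-in for the node's action registry; NOT an OpenSearch class.
class ActionRegistry {
    private final Map<String, IpLookup> actions = new HashMap<>();

    // Called by the providing plugin at bootstrap time.
    void register(String actionName, IpLookup handler) {
        actions.put(actionName, handler);
    }

    // Lets a consuming plugin check whether the provider is installed.
    Optional<IpLookup> find(String actionName) {
        return Optional.ofNullable(actions.get(actionName));
    }
}

class RegistryDemo {
    // Assumed action name, for illustration only.
    static final String ACTION_NAME = "example:geospatial/ip_enrichment";

    // Uses the registered lookup if present, otherwise returns an empty map.
    static Map<String, Object> enrichOrEmpty(ActionRegistry registry, String ip, String datasource) {
        return registry.find(ACTION_NAME)
            .map(handler -> handler.lookup(ip, datasource))
            .orElse(Map.of());
    }
}
```

In this sketch the consuming side depends only on the interface and the action name, so it keeps working, with an empty result, when the provider is not installed.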

API Design

  • Expose a newly created GeoSpatialNodeClient object that is responsible for taking an IpEnrichmentActionRequest, a composite object containing the following fields:

    • String : IPAddress
    • String / Enum : DataSet provider
  • And return an IpEnrichmentActionResponse which contains:

    • Map<String, Object>: Conversion result.
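
As a rough illustration of the request and response shapes listed above, here is a simplified sketch in plain Java. The class names come from this RFC, but the field names, accessors, and constructors are assumptions; the real classes would presumably extend OpenSearch's ActionRequest and ActionResponse and add serialization, which is omitted here.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Simplified stand-in for the proposed request: an IP address plus a datasource name.
final class IpEnrichmentActionRequest {
    private final String ipAddress;
    private final String datasourceName; // assumed field name

    IpEnrichmentActionRequest(String ipAddress, String datasourceName) {
        this.ipAddress = Objects.requireNonNull(ipAddress, "ipAddress");
        this.datasourceName = Objects.requireNonNull(datasourceName, "datasourceName");
    }

    String getIpAddress() { return ipAddress; }

    String getDatasourceName() { return datasourceName; }
}

// Simplified stand-in for the proposed response: the raw conversion result.
final class IpEnrichmentActionResponse {
    private final Map<String, Object> geoData;

    IpEnrichmentActionResponse(Map<String, Object> geoData) {
        // Defensive copy, exposed as a read-only view.
        this.geoData = Collections.unmodifiableMap(new LinkedHashMap<>(geoData));
    }

    Map<String, Object> getGeoData() { return geoData; }
}
```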
@andy-k-improving andy-k-improving added the enhancement New feature or request label Nov 18, 2024
@andy-k-improving andy-k-improving changed the title (Draft) [RFC] Expose Geo2IP Enrichment feature to other plugins to standardize OpenSearch IP enrichment offering [RFC] Expose Geo2IP Enrichment feature to other plugins to standardize OpenSearch IP enrichment offering Nov 18, 2024
@heemin32
Collaborator

heemin32 commented Nov 18, 2024

Hi @andy-k-improving, it's good to know that other plugins might need to use the geo2ip feature. For a plugin to leverage the geo2ip functionality, I believe it would require a datasource as an input parameter. Is the DataSet provider the datasource?

@andy-k-improving
Contributor Author

@heemin32
Yes, that is correct. I'm proposing to take the IP and the datasource as input parameters and return the Map<String, Object> directly from ip2GeoCachedDao.getGeoData( ) or a similar datasource implementation. Since this is just a proxy method for accessing GeoSpatial, I prefer not to restrict what is returned at this stage.

For now I have a prototype on a remote branch, and I will push it by the end of the day.
Can I go ahead with the implementation so that we can discuss further on the MR, or is it preferable to leave this thread open for a while in order to reach a wider audience?

@heemin32
Collaborator

heemin32 commented Nov 18, 2024

Feel free to proceed with publishing the PR, and we can continue the discussion there.

Based on opensearch-project/sql#3038, it might be useful to have an optional field name as an input parameter as well.
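
One way an optional field parameter might behave is sketched below in plain Java. The `GeoFieldFilter` helper, its `select` method, and the choice of a list of field names are all assumptions made for illustration; nothing here is taken from the GeoSpatial or SQL plugin code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: narrows a lookup result to the requested fields.
class GeoFieldFilter {
    // Returns every entry when no fields are requested;
    // otherwise keeps only the requested keys that are present.
    static Map<String, Object> select(Map<String, Object> geoData, List<String> fields) {
        if (fields == null || fields.isEmpty()) {
            return new LinkedHashMap<>(geoData);
        }
        Map<String, Object> selected = new LinkedHashMap<>();
        for (String field : fields) {
            if (geoData.containsKey(field)) {
                selected.put(field, geoData.get(field));
            }
        }
        return selected;
    }
}
```

Leaving the parameter optional keeps the default behavior of returning the full map, so callers that do not need filtering are unaffected.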

@andy-k-improving
Contributor Author

That makes sense; I will update the RFC accordingly.
Cool, then I will proceed, thanks!

@andy-k-improving
Contributor Author

Implementation available under:
#700

@dblock dblock removed the untriaged label Dec 9, 2024
@dblock
Member

dblock commented Dec 9, 2024

[Catch All Triage - 1, 2, 3, 4]

3 participants