Add support for knn_vector property type #524

Merged 14 commits on Jun 20, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -27,6 +27,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
## [Unreleased 2.x]

### Added
- Add support for knn_vector field type ([#524](https://github.com/opensearch-project/opensearch-java/pull/524))

### Dependencies

198 changes: 195 additions & 3 deletions USER_GUIDE.md
@@ -6,11 +6,16 @@
- [Create a client](#create-a-client)
- [Create a client using `RestClientTransport`](#create-a-client-using-restclienttransport)
- [Create a client using `ApacheHttpClient5Transport`](#create-a-client-using-apachehttpclient5transport)
- [Create an index](#create-an-index)
- [Create an index with default settings](#create-an-index-with-default-settings)
- [Create an index with custom settings and mappings](#create-an-index-with-custom-settings-and-mappings)
- [Index data](#index-data)
- [Search for the documents](#search-for-the-documents)
- [Get raw JSON results](#get-raw-json-results)
- [Search documents using a match query](#search-documents-using-a-match-query)
- [Search documents using k-NN](#search-documents-using-k-nn)
- [Exact k-NN with scoring script](#exact-k-nn-with-scoring-script)
- [Exact k-NN with painless scripting extension](#exact-k-nn-with-painless-scripting-extension)
- [Search documents using suggesters](#search-documents-using-suggesters)
- [App Data class](#app-data-class)
- [Using completion suggester](#using-completion-suggester)
@@ -84,7 +89,7 @@ There are multiple low level transports which `OpenSearchClient` could be config
import org.apache.hc.core5.http.HttpHost;

final HttpHost[] hosts = new HttpHost[] {
-    new HttpHost("localhost", "http", 9200)
+    new HttpHost("http", "localhost", 9200)
};

// Initialize the client with SSL and TLS enabled
@@ -112,7 +117,7 @@ Upcoming OpenSearch `3.0.0` release brings HTTP/2 support and as such, the `Rest
import org.apache.hc.core5.http.HttpHost;

final HttpHost[] hosts = new HttpHost[] {
-    new HttpHost("localhost", "http", 9200)
+    new HttpHost("http", "localhost", 9200)
};

final OpenSearchTransport transport = ApacheHttpClient5TransportBuilder
@@ -140,12 +145,33 @@ OpenSearchClient client = new OpenSearchClient(transport);

## Create an index

### Create an index with default settings

```java
String index = "sample-index";
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder().index(index).build();
client.indices().create(createIndexRequest);
```

### Create an index with custom settings and mappings

```java
String index = "sample-index";
IndexSettings settings = new IndexSettings.Builder()
        .numberOfShards("2")
        .numberOfReplicas("1")
        .build();
TypeMapping mapping = new TypeMapping.Builder()
        .properties("age", new Property.Builder().integer(new IntegerNumberProperty.Builder().build()).build())
        .build();
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
        .index(index)
        .settings(settings)
        .mappings(mapping)
        .build();
client.indices().create(createIndexRequest);
```

## Index data

```java
@@ -191,6 +217,172 @@ for (int i = 0; i < searchResponse.hits().hits().size(); i++) {
}
```

## Search documents using k-NN

### Exact k-NN with scoring script

1. Create index with custom mapping

```java
String index = "my-knn-index-1";
TypeMapping mapping = new TypeMapping.Builder()
        .properties("my_vector", new Property.Builder()
                .knnVector(new KnnVectorProperty.Builder()
                        .dimension(4)
                        .build())
                .build())
        .build();
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
        .index(index)
        .mappings(mapping)
        .build();
client.indices().create(createIndexRequest);
```

2. Index documents

```java
JsonObject doc1 = Json.createObjectBuilder()
        .add("my_vector", Json.createArrayBuilder().add(1.5).add(5.5).add(4.5).add(6.4).build())
        .add("price", 10.3)
        .build();
JsonObject doc2 = Json.createObjectBuilder()
        .add("my_vector", Json.createArrayBuilder().add(2.5).add(3.5).add(5.6).add(6.7).build())
        .add("price", 5.5)
        .build();
JsonObject doc3 = Json.createObjectBuilder()
        .add("my_vector", Json.createArrayBuilder().add(4.5).add(5.5).add(6.7).add(3.7).build())
        .add("price", 4.4)
        .build();

ArrayList<BulkOperation> operations = new ArrayList<>();
operations.add(new BulkOperation.Builder().index(
        IndexOperation.of(io -> io.index(index).id("1").document(doc1))
).build());
operations.add(new BulkOperation.Builder().index(
        IndexOperation.of(io -> io.index(index).id("2").document(doc2))
).build());
operations.add(new BulkOperation.Builder().index(
        IndexOperation.of(io -> io.index(index).id("3").document(doc3))
).build());

BulkRequest bulkRequest = new BulkRequest.Builder()
        .index(index)
        .operations(operations)
        .build();
client.bulk(bulkRequest);
```

3. Search documents using the k-NN script score (_This example uses `com.fasterxml.jackson.databind.JsonNode` as the target document class. `JsonNode` is not part of the OpenSearch Java library, and any document class that matches the searched data can be used instead._)

```java
InlineScript inlineScript = new InlineScript.Builder()
        .source("knn_score")
        .lang("knn")
        .params(Map.of(
                "field", JsonData.of("my_vector"),
                "query_value", JsonData.of(List.of(1.5, 5.5, 4.5, 6.4)),
                "space_type", JsonData.of("cosinesimil")
        ))
        .build();
Query query = new Query.Builder()
        .scriptScore(new ScriptScoreQuery.Builder()
                .query(new Query.Builder()
                        .matchAll(new MatchAllQuery.Builder().build())
                        .build())
                .script(new Script.Builder()
                        .inline(inlineScript)
                        .build())
                .build())
        .build();
SearchRequest searchRequest = new SearchRequest.Builder()
        .index(index)
        .query(query)
        .build();
SearchResponse<JsonNode> searchResponse = client.search(searchRequest, JsonNode.class);
```
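
The mapping in step 1 declares `dimension(4)`, so each `my_vector` array indexed in step 2 and the `query_value` passed in step 3 should have exactly four components. The following is a minimal client-side sketch of that check in plain Java. The class name `VectorDimensionCheck`, the constant `DIMENSION`, and the helper `hasExpectedDimension` are hypothetical names used only for illustration and are not part of the OpenSearch Java library.

```java
import java.util.List;

// Hypothetical helper, not part of opensearch-java.
class VectorDimensionCheck {

    // Assumed to match the dimension(4) declared in the knn_vector mapping.
    static final int DIMENSION = 4;

    // Returns true only when the vector is non-null and has DIMENSION components.
    public static boolean hasExpectedDimension(List<Double> vector) {
        return vector != null && vector.size() == DIMENSION;
    }

    public static void main(String[] args) {
        List<Double> queryValue = List.of(1.5, 5.5, 4.5, 6.4);
        List<Double> tooShort = List.of(1.5, 5.5, 4.5);

        System.out.println("query_value ok: " + hasExpectedDimension(queryValue));
        System.out.println("short vector ok: " + hasExpectedDimension(tooShort));
    }
}
```

Running a check like this before building the bulk or search request can surface a malformed vector on the client side instead of in a server error.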

### Exact k-NN with painless scripting extension

1. Create index with custom mapping

```java
String index = "my-knn-index-1";
TypeMapping mapping = new TypeMapping.Builder()
        .properties("my_vector", new Property.Builder()
                .knnVector(new KnnVectorProperty.Builder()
                        .dimension(4)
                        .build())
                .build())
        .build();
CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
        .index(index)
        .mappings(mapping)
        .build();
client.indices().create(createIndexRequest);
```

2. Index documents

```java
JsonObject doc1 = Json.createObjectBuilder()
        .add("my_vector", Json.createArrayBuilder().add(1.5).add(5.5).add(4.5).add(6.4).build())
        .add("price", 10.3)
        .build();
JsonObject doc2 = Json.createObjectBuilder()
        .add("my_vector", Json.createArrayBuilder().add(2.5).add(3.5).add(5.6).add(6.7).build())
        .add("price", 5.5)
        .build();
JsonObject doc3 = Json.createObjectBuilder()
        .add("my_vector", Json.createArrayBuilder().add(4.5).add(5.5).add(6.7).add(3.7).build())
        .add("price", 4.4)
        .build();

ArrayList<BulkOperation> operations = new ArrayList<>();
operations.add(new BulkOperation.Builder().index(
        IndexOperation.of(io -> io.index(index).id("1").document(doc1))
).build());
operations.add(new BulkOperation.Builder().index(
        IndexOperation.of(io -> io.index(index).id("2").document(doc2))
).build());
operations.add(new BulkOperation.Builder().index(
        IndexOperation.of(io -> io.index(index).id("3").document(doc3))
).build());

BulkRequest bulkRequest = new BulkRequest.Builder()
        .index(index)
        .operations(operations)
        .build();
client.bulk(bulkRequest);
```

3. Search documents using k-NN with the painless scripting extension (_This example uses `com.fasterxml.jackson.databind.JsonNode` as the target document class. `JsonNode` is not part of the OpenSearch Java library, and any document class that matches the searched data can be used instead._)

```java
InlineScript inlineScript = new InlineScript.Builder()
        .source("1.0 + cosineSimilarity(params.query_value, doc[params.field])")
        .params(Map.of(
                "field", JsonData.of("my_vector"),
                "query_value", JsonData.of(List.of(1.5, 5.5, 4.5, 6.4))
        ))
        .build();
Query query = new Query.Builder()
        .scriptScore(new ScriptScoreQuery.Builder()
                .query(new Query.Builder()
                        .matchAll(new MatchAllQuery.Builder().build())
                        .build())
                .script(new Script.Builder()
                        .inline(inlineScript)
                        .build())
                .build())
        .build();
SearchRequest searchRequest = new SearchRequest.Builder()
        .index(index)
        .query(query)
        .build();
// client.search returns SearchResponse<T>, matching the earlier search examples.
SearchResponse<JsonNode> searchResponse = client.search(searchRequest, JsonNode.class);
```
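
To make the script source above concrete, the sketch below evaluates `1.0 + cosineSimilarity(query_value, my_vector)` in plain Java for the three sample documents. It assumes the standard cosine similarity definition (dot product divided by the product of the vector norms). The class `CosineScoreSketch` and its methods are hypothetical and only approximate what the server computes. They are not part of the OpenSearch Java library or the k-NN plugin.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of opensearch-java or the k-NN plugin.
class CosineScoreSketch {

    // Standard cosine similarity: dot(a, b) / (|a| * |b|).
    public static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Mirrors the script source: 1.0 + cosineSimilarity(query, docVector).
    public static double score(double[] query, double[] docVector) {
        return 1.0 + cosineSimilarity(query, docVector);
    }

    public static void main(String[] args) {
        double[] query = {1.5, 5.5, 4.5, 6.4};

        // Sample vectors copied from the indexing step, keyed by document id.
        Map<String, double[]> docs = new LinkedHashMap<>();
        docs.put("1", new double[] {1.5, 5.5, 4.5, 6.4});
        docs.put("2", new double[] {2.5, 3.5, 5.6, 6.7});
        docs.put("3", new double[] {4.5, 5.5, 6.7, 3.7});

        docs.forEach((id, vector) ->
                System.out.printf("doc %s -> %.4f%n", id, score(query, vector)));
    }
}
```

Document 1 has the same vector as the query, so its score is approximately 2.0, the maximum for this expression. Documents 2 and 3 score progressively lower. Adding 1.0 shifts the cosine range from [-1, 1] to [0, 2], which keeps the resulting score non-negative.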

## Search documents using suggesters

### App Data class