Adding support for generic re-ranker interface and OpenSearch ML re-ranker for improving search relevancy. #494

Merged 27 commits on Jan 16, 2024
Changes from 1 commit
Commits
bdf5d9b
Add rerank processor interfaces
HenryL27 Nov 14, 2023
17cff65
add cross-encoder specific logic and factory
HenryL27 Nov 14, 2023
8d476db
add unittests
HenryL27 Nov 16, 2023
4efa463
add integration test
HenryL27 Nov 18, 2023
de96761
use string.format() instead of concatenation
HenryL27 Dec 1, 2023
6f85824
rename generateScoringContext to generateRerankingContext
HenryL27 Dec 1, 2023
a30180c
add name change in test too. whoops
HenryL27 Dec 1, 2023
b8820ec
start refactoring with contextSourceFetchers
HenryL27 Dec 4, 2023
5e1c00b
refactor to use contextSourceFetchers to get context
HenryL27 Dec 5, 2023
2976807
rename CrossEncoder to TextSimilarity
HenryL27 Dec 5, 2023
5332fee
add query_context layer to search ext
HenryL27 Dec 5, 2023
aa1d524
add javadocs
HenryL27 Dec 5, 2023
77301d9
update to new asyncProcessResponse api
HenryL27 Dec 11, 2023
e8de412
rename reranktype to ML_OPENSEARCH
HenryL27 Dec 18, 2023
a7090b2
improve error messages for bad rerank type config
HenryL27 Dec 18, 2023
797eaf6
simplify configuration/factory logic
HenryL27 Dec 18, 2023
ddf2866
improve handling for non-flat-string context fields
HenryL27 Dec 18, 2023
14c8f89
rename TextSimilarity files to MLOpenSearch files
HenryL27 Dec 18, 2023
577f855
apply spotless after rebase
HenryL27 Dec 19, 2023
e3cf218
update changelog
HenryL27 Dec 21, 2023
7a6595f
after rebase
HenryL27 Jan 9, 2024
708fb66
Address pr comments and fix XContent in search ext
HenryL27 Jan 10, 2024
2d04075
move contextSourceFetchers to their own subdirectory
HenryL27 Jan 10, 2024
a39428b
Apply suggestions from code review
HenryL27 Jan 11, 2024
f462965
CR changes
HenryL27 Jan 11, 2024
db8bec1
finish CR comments and fix broken unittest
HenryL27 Jan 11, 2024
7962ffa
fix unittest names
HenryL27 Jan 11, 2024
move contextSourceFetchers to their own subdirectory
Signed-off-by: HenryL27
HenryL27 committed Jan 10, 2024
commit 2d04075d4a583f8837da45f665fff8372c9f75fd
Original file line number Diff line number Diff line change
@@ -13,11 +13,11 @@

import org.opensearch.ingest.ConfigurationUtils;
import org.opensearch.neuralsearch.ml.MLCommonsClientAccessor;
-import org.opensearch.neuralsearch.processor.rerank.ContextSourceFetcher;
-import org.opensearch.neuralsearch.processor.rerank.DocumentContextSourceFetcher;
import org.opensearch.neuralsearch.processor.rerank.MLOpenSearchRerankProcessor;
-import org.opensearch.neuralsearch.processor.rerank.QueryContextSourceFetcher;
import org.opensearch.neuralsearch.processor.rerank.RerankType;
+import org.opensearch.neuralsearch.processor.rerank.context.ContextSourceFetcher;
+import org.opensearch.neuralsearch.processor.rerank.context.DocumentContextSourceFetcher;
+import org.opensearch.neuralsearch.processor.rerank.context.QueryContextSourceFetcher;
import org.opensearch.search.pipeline.Processor;
import org.opensearch.search.pipeline.SearchResponseProcessor;

Original file line number Diff line number Diff line change
@@ -13,6 +13,9 @@
import org.opensearch.core.action.ActionListener;
import org.opensearch.neuralsearch.ml.MLCommonsClientAccessor;
import org.opensearch.neuralsearch.processor.factory.RerankProcessorFactory;
+import org.opensearch.neuralsearch.processor.rerank.context.ContextSourceFetcher;
+import org.opensearch.neuralsearch.processor.rerank.context.DocumentContextSourceFetcher;
+import org.opensearch.neuralsearch.processor.rerank.context.QueryContextSourceFetcher;

/**
* Rescoring Rerank Processor that uses a TextSimilarity model in ml-commons to rescore
Original file line number Diff line number Diff line change
@@ -12,6 +12,7 @@
import org.opensearch.action.search.SearchRequest;
import org.opensearch.action.search.SearchResponse;
import org.opensearch.core.action.ActionListener;
+import org.opensearch.neuralsearch.processor.rerank.context.ContextSourceFetcher;
import org.opensearch.search.pipeline.PipelineProcessingContext;
import org.opensearch.search.pipeline.SearchResponseProcessor;

Original file line number Diff line number Diff line change
@@ -13,6 +13,7 @@
import org.opensearch.action.search.SearchResponse;
import org.opensearch.action.search.SearchResponseSections;
import org.opensearch.core.action.ActionListener;
+import org.opensearch.neuralsearch.processor.rerank.context.ContextSourceFetcher;
import org.opensearch.search.SearchHit;
import org.opensearch.search.SearchHits;
import org.opensearch.search.profile.SearchProfileShardResults;
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/
-package org.opensearch.neuralsearch.processor.rerank;
+package org.opensearch.neuralsearch.processor.rerank.context;

import java.util.Map;

Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/
-package org.opensearch.neuralsearch.processor.rerank;
+package org.opensearch.neuralsearch.processor.rerank.context;

import java.util.ArrayList;
import java.util.HashMap;
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/
-package org.opensearch.neuralsearch.processor.rerank;
+package org.opensearch.neuralsearch.processor.rerank.context;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
Original file line number Diff line number Diff line change
@@ -16,10 +16,10 @@
import org.mockito.Mock;
import org.opensearch.OpenSearchParseException;
import org.opensearch.neuralsearch.ml.MLCommonsClientAccessor;
-import org.opensearch.neuralsearch.processor.rerank.DocumentContextSourceFetcher;
import org.opensearch.neuralsearch.processor.rerank.MLOpenSearchRerankProcessor;
import org.opensearch.neuralsearch.processor.rerank.RerankProcessor;
import org.opensearch.neuralsearch.processor.rerank.RerankType;
+import org.opensearch.neuralsearch.processor.rerank.context.DocumentContextSourceFetcher;
import org.opensearch.search.pipeline.Processor.PipelineContext;
import org.opensearch.search.pipeline.SearchResponseProcessor;
import org.opensearch.test.OpenSearchTestCase;
Original file line number Diff line number Diff line change
@@ -36,6 +36,8 @@
import org.opensearch.core.xcontent.XContentBuilder;
import org.opensearch.neuralsearch.ml.MLCommonsClientAccessor;
import org.opensearch.neuralsearch.processor.factory.RerankProcessorFactory;
+import org.opensearch.neuralsearch.processor.rerank.context.DocumentContextSourceFetcher;
+import org.opensearch.neuralsearch.processor.rerank.context.QueryContextSourceFetcher;
import org.opensearch.neuralsearch.query.NeuralQueryBuilder;
import org.opensearch.neuralsearch.query.ext.RerankSearchExtBuilder;
import org.opensearch.search.SearchExtBuilder;