Skip to content

Commit

Permalink
Fix the "highlight.max_analyzer_offset" request parameter with "plain…
Browse files Browse the repository at this point in the history
…" highlighter (#10919)

(cherry picked from commit 5d327a2)
  • Loading branch information
drunken-monkey authored and andrross committed Feb 17, 2024
1 parent de138ff commit 7d08018
Show file tree
Hide file tree
Showing 3 changed files with 30 additions and 3 deletions.
5 changes: 3 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,10 +24,11 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
### Removed

### Fixed
- Add a system property to configure YamlParser codepoint limits ([#12298](https://github.com/opensearch-project/OpenSearch/pull/12298))
- Prevent read beyond slice boundary in ByteArrayIndexInput ([#10481](https://github.com/opensearch-project/OpenSearch/issues/10481))
- [Revert] [Bug] Check phase name before SearchRequestOperationsListener onPhaseStart ([#12035](https://github.com/opensearch-project/OpenSearch/pull/12035))
- Add support of special WrappingSearchAsyncActionPhase so the onPhaseStart() will always be followed by onPhaseEnd() within AbstractSearchAsyncAction ([#12293](https://github.com/opensearch-project/OpenSearch/pull/12293))
- Add a system property to configure YamlParser codepoint limits ([#12298](https://github.com/opensearch-project/OpenSearch/pull/12298))
- Prevent read beyond slice boundary in ByteArrayIndexInput ([#10481](https://github.com/opensearch-project/OpenSearch/issues/10481))
- Fix the "highlight.max_analyzer_offset" request parameter with "plain" highlighter ([#10919](https://github.com/opensearch-project/OpenSearch/pull/10919))

### Security

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -78,3 +78,15 @@ setup:
index: test1
body: {"query" : {"match" : {"field2" : "fox"}}, "highlight" : {"type" : "plain", "fields" : {"field2" : {}}}}
- match: { error.root_cause.0.type: "illegal_argument_exception" }

---
"Plain highlighter on a field WITHOUT OFFSETS using max_analyzer_offset should SUCCEED":
- skip:
version: " - 2.1.99"
reason: only starting supporting the parameter max_analyzer_offset on version 2.2
- do:
search:
rest_total_hits_as_int: true
index: test1
body: {"query" : {"match" : {"field1" : "quick"}}, "highlight" : {"type" : "plain", "fields" : {"field1" : {"max_analyzer_offset": 10}}}}
- match: {hits.hits.0.highlight.field1.0: "The <em>quick</em> "}
Original file line number Diff line number Diff line change
Expand Up @@ -123,13 +123,27 @@ public HighlightField highlight(FieldHighlightContext fieldContext) throws IOExc
List<Object> textsToHighlight;
Analyzer analyzer = context.mapperService().documentMapper().mappers().indexAnalyzer();
final int maxAnalyzedOffset = context.getIndexSettings().getHighlightMaxAnalyzedOffset();
final Integer fieldMaxAnalyzedOffset = field.fieldOptions().maxAnalyzerOffset();
if (fieldMaxAnalyzedOffset != null && fieldMaxAnalyzedOffset > maxAnalyzedOffset) {
throw new IllegalArgumentException(
"max_analyzer_offset has exceeded ["
+ maxAnalyzedOffset
+ "] - maximum allowed to be analyzed for highlighting. "
+ "This maximum can be set by changing the ["
+ IndexSettings.MAX_ANALYZED_OFFSET_SETTING.getKey()
+ "] index level setting. "
+ "For large texts, indexing with offsets or term vectors is recommended!"
);
}

textsToHighlight = HighlightUtils.loadFieldValues(fieldType, context.getQueryShardContext(), hitContext, fieldContext.forceSource);

for (Object textToHighlight : textsToHighlight) {
String text = convertFieldValue(fieldType, textToHighlight);
int textLength = text.length();
if (textLength > maxAnalyzedOffset) {
if (fieldMaxAnalyzedOffset != null && textLength > fieldMaxAnalyzedOffset) {
text = text.substring(0, fieldMaxAnalyzedOffset);
} else if (textLength > maxAnalyzedOffset) {
throw new IllegalArgumentException(
"The length of ["
+ fieldContext.fieldName
Expand Down

0 comments on commit 7d08018

Please sign in to comment.