Apply the date histogram rewrite optimization to range aggregation #13865

Merged on Jun 19, 2024 (42 commits)
Changes from 21 commits
Commits
53cb70f
Refactor the ranges representation
bowenlan-amzn May 21, 2024
69730b1
Refactor try fast filter
bowenlan-amzn May 22, 2024
1e2d7f4
Main work finished; left the handling of different numeric data types
bowenlan-amzn May 23, 2024
95b04dd
buildRanges accepts field type
bowenlan-amzn May 28, 2024
8dd1dda
first working draft probably
bowenlan-amzn May 29, 2024
c5d2175
Merge branch 'main' into 13531-range-agg
bowenlan-amzn May 29, 2024
ed79e02
add change log
bowenlan-amzn May 29, 2024
c7043e4
accommodate geo distance agg
bowenlan-amzn May 29, 2024
90d6790
Fix test
bowenlan-amzn May 29, 2024
67c281c
Merge branch 'main' into 13531-range-agg
bowenlan-amzn May 29, 2024
c10c775
[Refactor] range is lower inclusive, right exclusive
bowenlan-amzn May 31, 2024
783b14a
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 2, 2024
06b3372
adding test
bowenlan-amzn Jun 5, 2024
c6b5a9c
Adding test and refactor
bowenlan-amzn Jun 5, 2024
d590081
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 5, 2024
58e5281
refactor
bowenlan-amzn Jun 5, 2024
37c6d84
add test
bowenlan-amzn Jun 6, 2024
e0ba84b
add test and update the compare logic in tree traversal
bowenlan-amzn Jun 6, 2024
4603ec0
fix test, add random test
bowenlan-amzn Jun 6, 2024
afbce0c
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 6, 2024
9359fc2
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 6, 2024
54bfe92
refactor to address comments
bowenlan-amzn Jun 6, 2024
6ae1a9b
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 7, 2024
cc92c44
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 7, 2024
6546736
small potential performance update
bowenlan-amzn Jun 8, 2024
328006b
fix precommit
bowenlan-amzn Jun 9, 2024
a290e1d
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 9, 2024
23bbcbb
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 10, 2024
1b586bb
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 10, 2024
65de090
refactor
bowenlan-amzn Jun 11, 2024
f3c07c7
refactor
bowenlan-amzn Jun 11, 2024
fc0aff5
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 11, 2024
bab28e6
set refresh_interval to -1
bowenlan-amzn Jun 11, 2024
e545b90
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 11, 2024
78b4d9d
address comment
bowenlan-amzn Jun 11, 2024
185ed4e
address comment
bowenlan-amzn Jun 12, 2024
910b66a
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 12, 2024
a2d50ce
address comment
bowenlan-amzn Jun 13, 2024
fe85ad3
Fix test
bowenlan-amzn Jun 13, 2024
48a03a4
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 18, 2024
07a5293
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 18, 2024
9764b23
Merge branch 'main' into 13531-range-agg
bowenlan-amzn Jun 19, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -19,6 +19,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Add remote routing table for remote state publication with experimental feature flag ([#13304](https://github.com/opensearch-project/OpenSearch/pull/13304))
- [Remote Store] Add support to disable flush based on translog reader count ([#14027](https://github.com/opensearch-project/OpenSearch/pull/14027))
- [Query Insights] Add exporter support for top n queries ([#12982](https://github.com/opensearch-project/OpenSearch/pull/12982))
- Apply the date histogram rewrite optimization to range aggregation ([#13865](https://github.com/opensearch-project/OpenSearch/pull/13865))

### Dependencies
- Bump `com.github.spullara.mustache.java:compiler` from 0.9.10 to 0.9.13 ([#13329](https://github.com/opensearch-project/OpenSearch/pull/13329), [#13559](https://github.com/opensearch-project/OpenSearch/pull/13559))
@@ -35,6 +35,7 @@
import com.fasterxml.jackson.core.JsonParseException;

import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NumericDocValues;
@@ -188,6 +189,17 @@ public ScaledFloatFieldType(String name, double scalingFactor) {
            this(name, true, false, true, Collections.emptyMap(), scalingFactor, null);
        }

        @Override
        public void encodePoint(Number value, byte[] point) {
            assert value instanceof Double;
            LongPoint.encodeDimension((long) (scalingFactor * value.doubleValue()), point, 0);
        }

        @Override
        public int pointNumBytes() {
            return Long.BYTES;
        }

        public double getScalingFactor() {
            return scalingFactor;
        }
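To illustrate what the `encodePoint` override above does, here is a minimal sketch that runs without Lucene on the classpath. The class `ScaledFloatPointSketch` and its methods are hypothetical names, and `encodeSortableLong` is only a stand-in that assumes `LongPoint.encodeDimension` flips the sign bit and writes the long big-endian, so that unsigned byte order matches numeric order. It is not the code from this PR.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical stand-in for illustration; not the OpenSearch or Lucene API. */
final class ScaledFloatPointSketch {

    // Assumed behaviour of a sortable long encoding: flip the sign bit, then
    // write the value big-endian so unsigned byte comparison follows numeric order.
    static void encodeSortableLong(long value, byte[] dest, int offset) {
        ByteBuffer.wrap(dest, offset, Long.BYTES).putLong(value ^ Long.MIN_VALUE);
    }

    // Mirrors the shape of the override in the diff: scale the double, cast to
    // long (truncating), and encode the result into a fixed-width point.
    static byte[] encodeScaled(double value, double scalingFactor) {
        byte[] point = new byte[Long.BYTES];
        encodeSortableLong((long) (scalingFactor * value), point, 0);
        return point;
    }

    public static void main(String[] args) {
        double factor = 100; // same scaling_factor as the YAML test mapping
        byte[] negative = encodeScaled(-2.1, factor);
        byte[] one = encodeScaled(1.0, factor);
        byte[] onePointFive = encodeScaled(1.5, factor);

        // Byte-wise unsigned comparison should agree with numeric order.
        System.out.println(Arrays.compareUnsigned(negative, one) < 0);
        System.out.println(Arrays.compareUnsigned(one, onePointFive) < 0);
    }
}
```

The point of the fixed-width, order-preserving encoding is that range bounds can be compared against indexed point values as raw bytes, without decoding each value.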
@@ -14,6 +14,9 @@ setup:
                date:
                  type: date
                  format: epoch_second
                scaled_field:
                  type: scaled_float
                  scaling_factor: 100

  - do:
      cluster.health:
@@ -528,3 +531,131 @@ setup:
  - is_false: aggregations.unsigned_long_range.buckets.2.to

  - match: { aggregations.unsigned_long_range.buckets.2.doc_count: 0 }

---
"Double range profiler shows filter rewrite info":
  - skip:
      version: " - 2.99.99"
      reason: debug info for filter rewrite added in 3.0.0 (to be backported to 2.14.0)

  - do:
      index:
        index: test
        id: 1
        body: { "double" : 42 }

  - do:
      index:
        index: test
        id: 2
        body: { "double" : 100 }

  - do:
      index:
        index: test
        id: 3
        body: { "double" : 50 }

  - do:
      indices.refresh: {}

  - do:
      search:
        index: test
        body:
          size: 0
          profile: true
          aggs:
            double_range:
              range:
                field: double
                ranges:
                  - to: 50
                  - from: 50
                    to: 150
                  - from: 150

  - length: { aggregations.double_range.buckets: 3 }

  - match: { aggregations.double_range.buckets.0.key: "*-50.0" }
  - is_false: aggregations.double_range.buckets.0.from
  - match: { aggregations.double_range.buckets.0.to: 50.0 }
  - match: { aggregations.double_range.buckets.0.doc_count: 1 }
  - match: { aggregations.double_range.buckets.1.key: "50.0-150.0" }
  - match: { aggregations.double_range.buckets.1.from: 50.0 }
  - match: { aggregations.double_range.buckets.1.to: 150.0 }
  - match: { aggregations.double_range.buckets.1.doc_count: 2 }
  - match: { aggregations.double_range.buckets.2.key: "150.0-*" }
  - match: { aggregations.double_range.buckets.2.from: 150.0 }
  - is_false: aggregations.double_range.buckets.2.to
  - match: { aggregations.double_range.buckets.2.doc_count: 0 }

  - match: { profile.shards.0.aggregations.0.debug.optimized_segments: 1 }
  - match: { profile.shards.0.aggregations.0.debug.unoptimized_segments: 0 }
  - match: { profile.shards.0.aggregations.0.debug.leaf_visited: 1 }
  - match: { profile.shards.0.aggregations.0.debug.inner_visited: 0 }

---
"Scaled Float Range Aggregation":
  - do:
      index:
        index: test
        id: 1
        body: { "scaled_field": 1 }

  - do:
      index:
        index: test
        id: 2
        body: { "scaled_field": 1.53 }

  - do:
      index:
        index: test
        id: 3
        body: { "scaled_field": -2.1 }

  - do:
      index:
        index: test
        id: 4
        body: { "scaled_field": 1.53 }

  - do:
      indices.refresh: { }

  - do:
      search:
        index: test
        body:
          size: 0
          aggs:
            my_range:
              range:
                field: scaled_field
                ranges:
                  - to: 0
                  - from: 0
                    to: 1
                  - from: 1
                    to: 1.5
                  - from: 1.5

  - length: { aggregations.my_range.buckets: 4 }

  - match: { aggregations.my_range.buckets.0.key: "*-0.0" }
  - is_false: aggregations.my_range.buckets.0.from
  - match: { aggregations.my_range.buckets.0.to: 0.0 }
  - match: { aggregations.my_range.buckets.0.doc_count: 1 }
  - match: { aggregations.my_range.buckets.1.key: "0.0-1.0" }
  - match: { aggregations.my_range.buckets.1.from: 0.0 }
  - match: { aggregations.my_range.buckets.1.to: 1.0 }
  - match: { aggregations.my_range.buckets.1.doc_count: 0 }
  - match: { aggregations.my_range.buckets.2.key: "1.0-1.5" }
  - match: { aggregations.my_range.buckets.2.from: 1.0 }
  - match: { aggregations.my_range.buckets.2.to: 1.5 }
  - match: { aggregations.my_range.buckets.2.doc_count: 1 }
  - match: { aggregations.my_range.buckets.3.key: "1.5-*" }
  - match: { aggregations.my_range.buckets.3.from: 1.5 }
  - is_false: aggregations.my_range.buckets.3.to
  - match: { aggregations.my_range.buckets.3.doc_count: 2 }
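The expected `doc_count` values in the scaled float test follow from each range being lower-inclusive and upper-exclusive, as the commit "range is lower inclusive, right exclusive" describes. The sketch below recomputes them with a plain loop; `RangeBucketSketch` and `countBuckets` are hypothetical names, unbounded ends are modelled with infinities, and values are compared as raw doubles rather than as scaled longs, which is an assumption that happens to give the same result for these inputs.

```java
import java.util.Arrays;

/** Hypothetical illustration of [from, to) bucketing; not the aggregator from this PR. */
final class RangeBucketSketch {

    // Counts how many values fall into each [from, to) range.
    // ranges[i][0] is the inclusive lower bound, ranges[i][1] the exclusive upper bound.
    static int[] countBuckets(double[] values, double[][] ranges) {
        int[] counts = new int[ranges.length];
        for (double value : values) {
            for (int i = 0; i < ranges.length; i++) {
                if (value >= ranges[i][0] && value < ranges[i][1]) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The four documents indexed by the "Scaled Float Range Aggregation" test.
        double[] values = { 1, 1.53, -2.1, 1.53 };
        double[][] ranges = {
            { Double.NEGATIVE_INFINITY, 0 },   // to: 0
            { 0, 1 },                          // from: 0, to: 1
            { 1, 1.5 },                        // from: 1, to: 1.5
            { 1.5, Double.POSITIVE_INFINITY }  // from: 1.5
        };
        System.out.println(Arrays.toString(countBuckets(values, ranges))); // prints [1, 0, 1, 2]
    }
}
```

The value `1` lands in the `1.0-1.5` bucket rather than `0.0-1.0` because the upper bound is exclusive, which is why the second bucket is expected to be empty.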
@@ -549,6 +549,16 @@ public static long parseToLong(
            return resolution.convert(dateParser.parse(BytesRefs.toString(value), now, roundUp, zone));
        }

        @Override
        public void encodePoint(Number value, byte[] point) {
            LongPoint.encodeDimension(value.longValue(), point, 0);
        }

        @Override
        public int pointNumBytes() {
            return Long.BYTES;
        }

        @Override
        public Query distanceFeatureQuery(Object origin, String pivot, float boost, QueryShardContext context) {
            failIfNotIndexedAndNoDocValues();
@@ -536,4 +536,12 @@ public Map<String, String> meta() {
    public TextSearchInfo getTextSearchInfo() {
        return textSearchInfo;
    }

    public void encodePoint(Number value, byte[] point) {
        throw new IllegalCallerException("Field [" + name() + "] of type [" + typeName() + "] does not support encoding points");
    }

    public int pointNumBytes() {
        throw new IllegalCallerException("Field [" + name() + "] of type [" + typeName() + "] does not support encoding points");
    }
}
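The base-class change above makes point encoding opt-in: the default methods throw, and only field types that index numeric points override them. A minimal sketch of that pattern follows. Every name in it (`PointEncodingSketch`, `BaseFieldType`, `LongFieldType`, `tryEncode`) is hypothetical, and the sortable encoding is a stand-in that assumes a sign-bit flip plus big-endian layout.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of the throw-by-default pattern; not OpenSearch classes. */
final class PointEncodingSketch {

    // Base type: point encoding is unsupported unless a subclass overrides both methods.
    static class BaseFieldType {
        private final String name;

        BaseFieldType(String name) {
            this.name = name;
        }

        String name() {
            return name;
        }

        String typeName() {
            return "keyword";
        }

        void encodePoint(Number value, byte[] point) {
            throw new IllegalCallerException("Field [" + name() + "] of type [" + typeName() + "] does not support encoding points");
        }

        int pointNumBytes() {
            throw new IllegalCallerException("Field [" + name() + "] of type [" + typeName() + "] does not support encoding points");
        }
    }

    // A numeric type opts in by overriding both methods.
    static class LongFieldType extends BaseFieldType {
        LongFieldType(String name) {
            super(name);
        }

        @Override
        String typeName() {
            return "long";
        }

        @Override
        void encodePoint(Number value, byte[] point) {
            // Assumed sortable encoding: flip the sign bit and write big-endian.
            ByteBuffer.wrap(point).putLong(value.longValue() ^ Long.MIN_VALUE);
        }

        @Override
        int pointNumBytes() {
            return Long.BYTES;
        }
    }

    // Returns the encoded point, or null when the field type does not support encoding.
    static byte[] tryEncode(BaseFieldType fieldType, Number value) {
        try {
            byte[] point = new byte[fieldType.pointNumBytes()];
            fieldType.encodePoint(value, point);
            return point;
        } catch (IllegalCallerException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryEncode(new LongFieldType("count"), 5L) != null);  // supported type
        System.out.println(tryEncode(new BaseFieldType("tag"), 5L) == null);    // unsupported type
    }
}
```

A caller in this style can ask any field type for `pointNumBytes()` first, so an unsupported type fails before any buffer is allocated or written.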