Skip to content

Commit

Permalink
Merge remote-tracking branch 'upstream/main' into test_no_overlap_ccs…
Browse files Browse the repository at this point in the history
…_ccr
  • Loading branch information
jakelandis committed May 13, 2024
2 parents 855e2ef + e600d41 commit 6f3e362
Show file tree
Hide file tree
Showing 602 changed files with 9,968 additions and 4,175 deletions.
2 changes: 1 addition & 1 deletion .buildkite/pipelines/intake.yml
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ steps:
timeout_in_minutes: 300
matrix:
setup:
BWC_VERSION: ["7.17.22", "8.13.4", "8.14.0", "8.15.0"]
BWC_VERSION: ["7.17.22", "8.13.5", "8.14.0", "8.15.0"]
agents:
provider: gcp
image: family/elasticsearch-ubuntu-2004
Expand Down
6 changes: 3 additions & 3 deletions .buildkite/pipelines/periodic-packaging.yml
Original file line number Diff line number Diff line change
Expand Up @@ -529,8 +529,8 @@ steps:
env:
BWC_VERSION: 8.12.2

- label: "{{matrix.image}} / 8.13.4 / packaging-tests-upgrade"
command: ./.ci/scripts/packaging-test.sh -Dbwc.checkout.align=true destructiveDistroUpgradeTest.v8.13.4
- label: "{{matrix.image}} / 8.13.5 / packaging-tests-upgrade"
command: ./.ci/scripts/packaging-test.sh -Dbwc.checkout.align=true destructiveDistroUpgradeTest.v8.13.5
timeout_in_minutes: 300
matrix:
setup:
Expand All @@ -543,7 +543,7 @@ steps:
machineType: custom-16-32768
buildDirectory: /dev/shm/bk
env:
BWC_VERSION: 8.13.4
BWC_VERSION: 8.13.5

- label: "{{matrix.image}} / 8.14.0 / packaging-tests-upgrade"
command: ./.ci/scripts/packaging-test.sh -Dbwc.checkout.align=true destructiveDistroUpgradeTest.v8.14.0
Expand Down
10 changes: 5 additions & 5 deletions .buildkite/pipelines/periodic.yml
Original file line number Diff line number Diff line change
Expand Up @@ -591,8 +591,8 @@ steps:
- signal_reason: agent_stop
limit: 3

- label: 8.13.4 / bwc
command: .ci/scripts/run-gradle.sh -Dbwc.checkout.align=true v8.13.4#bwcTest
- label: 8.13.5 / bwc
command: .ci/scripts/run-gradle.sh -Dbwc.checkout.align=true v8.13.5#bwcTest
timeout_in_minutes: 300
agents:
provider: gcp
Expand All @@ -601,7 +601,7 @@ steps:
buildDirectory: /dev/shm/bk
preemptible: true
env:
BWC_VERSION: 8.13.4
BWC_VERSION: 8.13.5
retry:
automatic:
- exit_status: "-1"
Expand Down Expand Up @@ -714,7 +714,7 @@ steps:
setup:
ES_RUNTIME_JAVA:
- openjdk17
BWC_VERSION: ["7.17.22", "8.13.4", "8.14.0", "8.15.0"]
BWC_VERSION: ["7.17.22", "8.13.5", "8.14.0", "8.15.0"]
agents:
provider: gcp
image: family/elasticsearch-ubuntu-2004
Expand Down Expand Up @@ -760,7 +760,7 @@ steps:
- openjdk17
- openjdk21
- openjdk22
BWC_VERSION: ["7.17.22", "8.13.4", "8.14.0", "8.15.0"]
BWC_VERSION: ["7.17.22", "8.13.5", "8.14.0", "8.15.0"]
agents:
provider: gcp
image: family/elasticsearch-ubuntu-2004
Expand Down
2 changes: 1 addition & 1 deletion .ci/bwcVersions
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,6 @@ BWC_VERSION:
- "8.10.4"
- "8.11.4"
- "8.12.2"
- "8.13.4"
- "8.13.5"
- "8.14.0"
- "8.15.0"
2 changes: 1 addition & 1 deletion .ci/snapshotBwcVersions
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
BWC_VERSION:
- "7.17.22"
- "8.13.4"
- "8.13.5"
- "8.14.0"
- "8.15.0"
4 changes: 4 additions & 0 deletions distribution/packages/src/deb/lintian/elasticsearch
Original file line number Diff line number Diff line change
Expand Up @@ -59,3 +59,7 @@ unknown-field License
# don't build them ourselves and the license precludes us modifying them
# to fix this.
library-not-linked-against-libc usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/lib/libmkl_*.so

# shared-lib-without-dependency-information (now shared-library-lacks-prerequisites) is falsely reported for libvec.so
# which has no dependencies (not even libc) besides the symbols in the base executable.
shared-lib-without-dependency-information usr/share/elasticsearch/lib/platform/linux-x64/libvec.so
5 changes: 5 additions & 0 deletions docs/changelog/108088.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 108088
summary: Add a SIMD (AVX2) optimised vector distance function for int7 on x64
area: "Search"
type: enhancement
issues: []
5 changes: 0 additions & 5 deletions docs/changelog/108276.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/108280.yaml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/changelog/108283.yaml

This file was deleted.

5 changes: 5 additions & 0 deletions docs/changelog/108410.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 108410
summary: GeoIP tasks should wait longer for master
area: Ingest Node
type: bug
issues: []
5 changes: 5 additions & 0 deletions docs/changelog/108452.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 108452
summary: Add the rerank task to the Elasticsearch internal inference service
area: Machine Learning
type: enhancement
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/108459.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 108459
summary: Do not use global ordinals strategy if the leaf reader context cannot be
obtained
area: Machine Learning
type: bug
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/108517.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 108517
summary: Forward `indexServiceSafe` exception to listener
area: Transform
type: bug
issues:
- 108418
5 changes: 5 additions & 0 deletions docs/changelog/108518.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 108518
summary: Remove leading is_ prefix from Enterprise geoip docs
area: Ingest Node
type: bug
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/108521.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 108521
summary: Adding override for lintian false positive on `libvec.so`
area: "Packaging"
type: bug
issues:
- 108514
5 changes: 5 additions & 0 deletions docs/changelog/108522.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 108522
summary: Ensure we return non-negative scores when scoring scalar dot-products
area: Vector Search
type: bug
issues: []
6 changes: 6 additions & 0 deletions docs/changelog/108562.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 108562
summary: Add `internalClusterTest` for and fix leak in `ExpandSearchPhase`
area: Search
type: bug
issues:
- 108369
110 changes: 105 additions & 5 deletions docs/internal/DistributedArchitectureGuide.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,14 @@
# Distributed Area Team Internals
# Distributed Area Internals

(Summary, brief discussion of our features)
The Distributed Area contains indexing and coordination systems.

The index path stretches from the user REST command through shard routing down to each individual shard's translog and storage
engine. Reindexing is effectively reading from a source index and writing to a destination index (perhaps on different nodes).
The coordination side includes cluster coordination, shard allocation, cluster autoscaling stats, task management, and cross
cluster replication. Less obvious coordination systems include networking, the discovery plugin system, the snapshot/restore
logic, and shard recovery.

A guide to the general Elasticsearch components can be found [here](https://github.com/elastic/elasticsearch/blob/main/docs/internal/GeneralArchitectureGuide.md).

# Networking

Expand Down Expand Up @@ -237,9 +245,101 @@ works in parallel with the storage engine.)

# Autoscaling

(Reactive and proactive autoscaling. Explain that we surface recommendations, how control plane uses it.)

(Sketch / list the different deciders that we have, and then also how we use information from each to make a recommendation.)
The Autoscaling API in ES (Elasticsearch) uses cluster and node level statistics to provide a recommendation
for a cluster size to support the current cluster data and active workloads. ES Autoscaling is paired
with an ES Cloud service that periodically polls the ES elected master node for suggested cluster
changes. The cloud service will add more resources to the cluster based on Elasticsearch's recommendation.
Elasticsearch by itself cannot automatically scale.

Autoscaling recommendations are tailored for the user [based on user defined policies][], composed of data
roles (hot, frozen, etc) and [deciders][]. There's a public [webinar on autoscaling][], as well as the
public [Autoscaling APIs] docs.

Autoscaling's current implementation is based primary on storage requirements, as well as memory capacity
for ML and frozen tier. It does not yet support scaling related to search load. Paired with ES Cloud,
autoscaling only scales upward, not downward, except for ML nodes that do get scaled up _and_ down.

[based on user defined policies]: https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-autoscaling.html
[deciders]: https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-deciders.html
[webinar on autoscaling]: https://www.elastic.co/webinars/autoscaling-from-zero-to-production-seamlessly
[Autoscaling APIs]: https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-apis.html

### Plugin REST and TransportAction entrypoints

Autoscaling is a [plugin][]. All the REST APIs can be found in [autoscaling/rest/][].
`GetAutoscalingCapacityAction` is the capacity calculation operation REST endpoint, as opposed to the
other rest commands that get/set/delete the policies guiding the capacity calculation. The Transport
Actions can be found in [autoscaling/action/], where [TransportGetAutoscalingCapacityAction][] is the
entrypoint on the master node for calculating the optimal cluster resources based on the autoscaling
policies.

[plugin]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/Autoscaling.java#L72
[autoscaling/rest/]: https://github.com/elastic/elasticsearch/tree/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/rest
[autoscaling/action/]: https://github.com/elastic/elasticsearch/tree/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action
[TransportGetAutoscalingCapacityAction]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L82-L98

### How cluster capacity is determined

[AutoscalingMetadata][] implements [Metadata.Custom][] in order to persist autoscaling policies. Each
Decider is an implementation of [AutoscalingDeciderService][]. The [AutoscalingCalculateCapacityService][]
is responsible for running the calculation.

[TransportGetAutoscalingCapacityAction.computeCapacity] is the entry point to [AutoscalingCalculateCapacityService.calculate],
which creates a [AutoscalingDeciderResults][] for [each autoscaling policy][]. [AutoscalingDeciderResults.toXContent][] then
determines the [maximum required capacity][] to return to the caller. [AutoscalingCapacity][] is the base unit of a cluster
resources recommendation.

The `TransportGetAutoscalingCapacityAction` response is cached to prevent concurrent callers
overloading the system: the operation is expensive. `TransportGetAutoscalingCapacityAction` contains
a [CapacityResponseCache][]. `TransportGetAutoscalingCapacityAction.masterOperation`
calls [through the CapacityResponseCache][], into the `AutoscalingCalculateCapacityService`, to handle
concurrent callers.

[AutoscalingMetadata]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/AutoscalingMetadata.java#L38
[Metadata.Custom]: https://github.com/elastic/elasticsearch/blob/v8.13.2/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java#L141-L145
[AutoscalingDeciderService]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderService.java#L16-L19
[AutoscalingCalculateCapacityService]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCalculateCapacityService.java#L43

[TransportGetAutoscalingCapacityAction.computeCapacity]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L102-L108
[AutoscalingCalculateCapacityService.calculate]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCalculateCapacityService.java#L108-L139
[AutoscalingDeciderResults]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderResults.java#L34-L38
[each autoscaling policy]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCalculateCapacityService.java#L124-L131
[AutoscalingDeciderResults.toXContent]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderResults.java#L78
[maximum required capacity]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderResults.java#L105-L116
[AutoscalingCapacity]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCapacity.java#L27-L35

[CapacityResponseCache]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L44-L47
[through the CapacityResponseCache]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L97

### Where the data comes from

The Deciders each pull data from different sources as needed to inform their decisions. The
[DiskThresholdMonitor][] is one such data source. The Monitor runs on the master node and maintains
lists of nodes that exceed various disk size thresholds. [DiskThresholdSettings][] contains the
threshold settings with which the `DiskThresholdMonitor` runs.

[DiskThresholdMonitor]: https://github.com/elastic/elasticsearch/blob/v8.13.2/server/src/main/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdMonitor.java#L53-L58
[DiskThresholdSettings]: https://github.com/elastic/elasticsearch/blob/v8.13.2/server/src/main/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettings.java#L24-L27

### Deciders

The `ReactiveStorageDeciderService` tracks information that demonstrates storage limitations are causing
problems in the cluster. It uses [an algorithm defined here][]. Some examples are
- information from the `DiskThresholdMonitor` to find out whether nodes are exceeding their storage capacity
- number of unassigned shards that failed allocation because of insufficient storage
- the max shard size and minimum node size, and whether these can be satisfied with the existing infrastructure

[an algorithm defined here]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ReactiveStorageDeciderService.java#L158-L176

The `ProactiveStorageDeciderService` maintains a forecast window that [defaults to 30 minutes][]. It only
runs on data streams (ILM, rollover, etc), not regular indexes. It looks at past [index changes][] that
took place within the forecast window to [predict][] resources that will be needed shortly.

[defaults to 30 minutes]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ProactiveStorageDeciderService.java#L32
[index changes]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ProactiveStorageDeciderService.java#L79-L83
[predict]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ProactiveStorageDeciderService.java#L85-L95

There are several more Decider Services, implementing the `AutoscalingDeciderService` interface.

# Snapshot / Restore

Expand Down
Loading

0 comments on commit 6f3e362

Please sign in to comment.