Commit

Merge branch 'main' into es-9.0-bump
elasticmachine authored Sep 10, 2024
2 parents bf48b1d + 574915d commit f263fe3
Showing 22 changed files with 335 additions and 33 deletions.
7 changes: 7 additions & 0 deletions .ci/scripts/resolve-dra-manifest.sh
@@ -23,7 +23,14 @@ LATEST_VERSION=$(strip_version $LATEST_BUILD)
if [ "$LATEST_VERSION" != "$ES_VERSION" ]; then
  echo "Latest build for '$ARTIFACT' is version $LATEST_VERSION but expected version $ES_VERSION." 1>&2
  NEW_BRANCH=$(echo $ES_VERSION | sed -E "s/([0-9]+\.[0-9]+)\.[0-9]/\1/g")

  # Temporary
  if [[ "$ES_VERSION" == "8.16.0" ]]; then
    NEW_BRANCH="8.x"
  fi

  echo "Using branch $NEW_BRANCH instead of $BRANCH." 1>&2
  echo "https://artifacts-$WORKFLOW.elastic.co/$ARTIFACT/latest/$NEW_BRANCH.json"
  LATEST_BUILD=$(fetch_build $WORKFLOW $ARTIFACT $NEW_BRANCH)
fi
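
Note on the branch fallback above: the `sed` expression drops the patch component of `$ES_VERSION`, and the temporary override maps `8.16.0` to `8.x`. The same logic is sketched below in Java, the language of most of this commit. `BranchResolver` and `resolveBranch` are hypothetical names used only for illustration; nothing like them is shown in the repository.

```java
// Sketch only: mirrors the sed expression and the temporary 8.16.0 override from the script.
// BranchResolver and resolveBranch are hypothetical names.
final class BranchResolver {

    // Drops the patch component, like: sed -E "s/([0-9]+\.[0-9]+)\.[0-9]/\1/g"
    static String resolveBranch(String esVersion) {
        if (esVersion.equals("8.16.0")) {
            return "8.x"; // temporary special case taken from the script
        }
        return esVersion.replaceAll("([0-9]+\\.[0-9]+)\\.[0-9]", "$1");
    }

    public static void main(String[] args) {
        System.out.println(resolveBranch("8.15.2"));
        System.out.println(resolveBranch("8.16.0"));
    }
}
```

The script applies the override after the `sed` call, whereas the sketch checks it first; the result is the same because the override depends only on the input version.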

173 changes: 171 additions & 2 deletions docs/reference/cluster/stats.asciidoc
@@ -1307,6 +1307,142 @@ Each repository type may also include other statistics about the repositories of
====

`ccs`::
(object) Contains information relating to <<modules-cross-cluster-search, {ccs}>> settings and activity in the cluster.
+
.Properties of `ccs`
[%collapsible%open]
=====
`_search`:::
(object) Contains the telemetry information about the <<modules-cross-cluster-search, {ccs}>> usage in the cluster.
+
.Properties of `_search`
[%collapsible%open]
======
`total`:::
(integer) The total number of {ccs} requests that have been executed by the cluster.

`success`:::
(integer) The total number of {ccs} requests that have been successfully executed by the cluster.

`skipped`:::
(integer) The total number of {ccs} requests (successful or failed) that had at least one remote cluster skipped.

`took`:::
(object) Contains statistics about the time taken to execute {ccs} requests.
+
.Properties of `took`
[%collapsible%open]
=======
`max`:::
(integer) The maximum time taken to execute a {ccs} request, in milliseconds.
`avg`:::
(integer) The average time taken to execute a {ccs} request, in milliseconds.
`p90`:::
(integer) The 90th percentile of the time taken to execute {ccs} requests, in milliseconds.
=======

`took_mrt_true`::
(object) Contains statistics about the time taken to execute {ccs} requests for which the
<<ccs-minimize-roundtrips,`ccs_minimize_roundtrips`>> setting was set to `true`.
+
.Properties of `took_mrt_true`
[%collapsible%open]
=======
`max`:::
(integer) The maximum time taken to execute a {ccs} request, in milliseconds.
`avg`:::
(integer) The average time taken to execute a {ccs} request, in milliseconds.
`p90`:::
(integer) The 90th percentile of the time taken to execute {ccs} requests, in milliseconds.
=======

`took_mrt_false`::
(object) Contains statistics about the time taken to execute {ccs} requests for which the
<<ccs-minimize-roundtrips,`ccs_minimize_roundtrips`>> setting was set to `false`.
+
.Properties of `took_mrt_false`
[%collapsible%open]
=======
`max`:::
(integer) The maximum time taken to execute a {ccs} request, in milliseconds.
`avg`:::
(integer) The average time taken to execute a {ccs} request, in milliseconds.
`p90`:::
(integer) The 90th percentile of the time taken to execute {ccs} requests, in milliseconds.
=======

`remotes_per_search_max`::
(integer) The maximum number of remote clusters that were queried in a single {ccs} request.

`remotes_per_search_avg`::
(float) The average number of remote clusters that were queried in a single {ccs} request.

`failure_reasons`::
(object) Contains statistics about the reasons for {ccs} request failures.
The keys are the failure reason names and the values are the number of requests that failed for that reason.

`features`::
(object) Contains statistics about the features used in {ccs} requests. The keys are the names of the search features,
and the values are the number of requests that used that feature. A single request can use more than one feature
(for example, both `async` and `wildcard`). Known features are:

* `async` - <<async-search, Async search>>

* `mrt` - <<ccs-minimize-roundtrips,`ccs_minimize_roundtrips`>> setting was set to `true`.

* `wildcard` - <<api-multi-index,Multi-target syntax>> for indices with wildcards was used in the search request.

`clients`::
(object) Contains statistics about the clients that executed {ccs} requests.
The keys are the names of the clients, and the values are the number of requests that were executed by that client.
Only known clients (such as `kibana` or `elasticsearch`) are counted.

`clusters`::
(object) Contains statistics about the clusters that were queried in {ccs} requests.
The keys are cluster names, and the values are per-cluster telemetry data.
This also includes the local cluster itself, which uses the name `(local)`.
+
.Properties of per-cluster data:
[%collapsible%open]
=======
`total`:::
(integer) The total number of successful (not skipped) {ccs} requests that were executed against this cluster.
This may include requests where partial results were returned, but not requests in which the cluster has been skipped entirely.
`skipped`:::
(integer) The total number of {ccs} requests for which this cluster was skipped.
`took`:::
(object) Contains statistics about the time taken to execute requests against this cluster.
+
.Properties of `took`
[%collapsible%open]
========
`max`:::
(integer) The maximum time taken to execute a {ccs} request, in milliseconds.

`avg`:::
(integer) The average time taken to execute a {ccs} request, in milliseconds.

`p90`:::
(integer) The 90th percentile of the time taken to execute {ccs} requests, in milliseconds.
========
=======

======
=====
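
Note on the `took` objects documented above: each one reports `max`, `avg`, and `p90` in milliseconds. The sketch below shows one plausible way to derive such a summary from raw durations. It is not the implementation behind the API, which this diff does not show. `TookStats` is a hypothetical name, `avg` is treated as the arithmetic mean rounded to the nearest integer, and the percentile uses the nearest-rank method; all three are assumptions.

```java
import java.util.Arrays;

// Illustrative only: TookStats is a hypothetical name, not a class from this commit.
final class TookStats {
    final long max;
    final long avg;
    final long p90;

    private TookStats(long max, long avg, long p90) {
        this.max = max;
        this.avg = avg;
        this.p90 = p90;
    }

    // Summarizes request durations given in milliseconds.
    static TookStats of(long[] tookMillis) {
        if (tookMillis.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        long[] sorted = tookMillis.clone();
        Arrays.sort(sorted);
        long sum = 0;
        for (long value : sorted) {
            sum += value;
        }
        // Assumption: avg is the arithmetic mean, rounded to the nearest integer.
        long avg = Math.round((double) sum / sorted.length);
        // Nearest-rank percentile: rank = ceil(0.9 * n), computed with integer arithmetic.
        int rank = (9 * sorted.length + 9) / 10;
        return new TookStats(sorted[sorted.length - 1], avg, sorted[rank - 1]);
    }

    public static void main(String[] args) {
        TookStats stats = of(new long[] {10, 20, 30, 40, 50, 60, 70, 80, 90, 100});
        System.out.println("max=" + stats.max + " avg=" + stats.avg + " p90=" + stats.p90);
    }
}
```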

[[cluster-stats-api-example]]
==== {api-examples-title}

@@ -1607,7 +1743,35 @@ The API returns the following response:
   },
   "repositories": {
     ...
-  }
+  },
+  "ccs": {
+    "_search": {
+      "total": 7,
+      "success": 7,
+      "skipped": 0,
+      "took": {
+        "max": 36,
+        "avg": 20,
+        "p90": 33
+      },
+      "took_mrt_true": {
+        "max": 33,
+        "avg": 15,
+        "p90": 33
+      },
+      "took_mrt_false": {
+        "max": 36,
+        "avg": 26,
+        "p90": 36
+      },
+      "remotes_per_search_max": 3,
+      "remotes_per_search_avg": 2.0,
+      "failure_reasons": { ... },
+      "features": { ... },
+      "clients": { ... },
+      "clusters": { ... }
+    }
+  }
 }
 --------------------------------------------------
 // TESTRESPONSE[s/"plugins": \[[^\]]*\]/"plugins": $body.$_path/]
@@ -1618,10 +1782,15 @@ The API returns the following response:
 // TESTRESPONSE[s/"packaging_types": \[[^\]]*\]/"packaging_types": $body.$_path/]
 // TESTRESPONSE[s/"snapshots": \{[^\}]*\}/"snapshots": $body.$_path/]
 // TESTRESPONSE[s/"repositories": \{[^\}]*\}/"repositories": $body.$_path/]
+// TESTRESPONSE[s/"clusters": \{[^\}]*\}/"clusters": $body.$_path/]
+// TESTRESPONSE[s/"features": \{[^\}]*\}/"features": $body.$_path/]
+// TESTRESPONSE[s/"clients": \{[^\}]*\}/"clients": $body.$_path/]
+// TESTRESPONSE[s/"failure_reasons": \{[^\}]*\}/"failure_reasons": $body.$_path/]
 // TESTRESPONSE[s/"field_types": \[[^\]]*\]/"field_types": $body.$_path/]
 // TESTRESPONSE[s/"runtime_field_types": \[[^\]]*\]/"runtime_field_types": $body.$_path/]
 // TESTRESPONSE[s/"search": \{[^\}]*\}/"search": $body.$_path/]
-// TESTRESPONSE[s/: true|false/: $body.$_path/]
+// TESTRESPONSE[s/"remotes_per_search_avg": [.0-9]+/"remotes_per_search_avg": $body.$_path/]
+// TESTRESPONSE[s/: (true|false)/: $body.$_path/]
 // TESTRESPONSE[s/: (\-)?[0-9]+/: $body.$_path/]
 // TESTRESPONSE[s/: "[^"]*"/: $body.$_path/]
 // These replacements do a few things:
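
Note on the boolean substitution in this hunk: the pattern `: true|false` appears alongside `: (true|false)`. Alternation binds more loosely than concatenation in `java.util.regex` (and in most regex dialects), so the ungrouped form matches either `: true` or a bare `false`. The sketch below demonstrates the difference; `AlternationDemo` and the `<bool>` placeholder are hypothetical, and whether the docs test harness uses exactly this regex dialect is an assumption.

```java
// Sketch only: shows why the alternation needs a group.
// AlternationDemo and the "<bool>" placeholder are hypothetical.
final class AlternationDemo {

    // Replaces every match of the pattern with a fixed placeholder.
    static String substitute(String json, String regex) {
        return json.replaceAll(regex, ": <bool>");
    }

    public static void main(String[] args) {
        String json = "{\"enabled\": false}";
        // Ungrouped: only the bare word "false" matches, so the original colon is left behind.
        System.out.println(substitute(json, ": true|false"));
        // Grouped: the colon and the boolean are replaced together.
        System.out.println(substitute(json, ": (true|false)"));
    }
}
```

With the ungrouped pattern the first line printed is `{"enabled": : <bool>}`; with the grouped pattern it is `{"enabled": <bool>}`.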
3 changes: 0 additions & 3 deletions muted-tests.yml
@@ -154,9 +154,6 @@ tests:
   issue: https://github.com/elastic/elasticsearch/issues/112471
 - class: org.elasticsearch.ingest.geoip.IngestGeoIpClientYamlTestSuiteIT
   issue: https://github.com/elastic/elasticsearch/issues/111497
-- class: org.elasticsearch.smoketest.SmokeTestIngestWithAllDepsClientYamlTestSuiteIT
-  method: test {yaml=ingest/80_ingest_simulate/Test ingest simulate with reroute and mapping validation from templates}
-  issue: https://github.com/elastic/elasticsearch/issues/112575
 - class: org.elasticsearch.script.mustache.LangMustacheClientYamlTestSuiteIT
   method: test {yaml=lang_mustache/50_multi_search_template/Multi-search template with errors}
   issue: https://github.com/elastic/elasticsearch/issues/112580
@@ -217,7 +217,9 @@ setup:
 "Test ingest simulate with reroute and mapping validation from templates":
 
   - skip:
-      features: headers
+      features:
+        - headers
+        - allowed_warnings
 
   - requires:
       cluster_features: ["simulate.mapping.validation.templates"]
@@ -241,6 +243,8 @@ setup:
   - match: { acknowledged: true }
 
   - do:
+      allowed_warnings:
+        - "index template [first-index-template] has index patterns [first-index*] matching patterns from existing older templates [global] with patterns (global => [*]); this template [first-index-template] will take precedence during new index creation"
       indices.put_index_template:
         name: first-index-template
         body:
@@ -255,6 +259,8 @@ setup:
                   type: text
 
   - do:
+      allowed_warnings:
+        - "index template [second-index-template] has index patterns [second-index*] matching patterns from existing older templates [global] with patterns (global => [*]); this template [second-index-template] will take precedence during new index creation"
       indices.put_index_template:
         name: second-index-template
         body:
@@ -207,6 +207,7 @@ static TransportVersion def(int id) {
     public static final TransportVersion UNASSIGNED_PRIMARY_COUNT_ON_CLUSTER_HEALTH = def(8_737_00_0);
     public static final TransportVersion ESQL_AGGREGATE_EXEC_TRACKS_INTERMEDIATE_ATTRS = def(8_738_00_0);
 
+    public static final TransportVersion CCS_TELEMETRY_STATS = def(8_739_00_0);
     /*
      * STOP! READ THIS FIRST! No, really,
      * ____ _____ ___ ____ _ ____ _____ _ ____ _____ _ _ ___ ____ _____ ___ ____ ____ _____ _
@@ -277,7 +277,7 @@ public int hashCode() {
      */
     public void add(CCSTelemetrySnapshot stats) {
         // This should be called in ClusterStatsResponse ctor only, so we don't need to worry about concurrency
-        if (stats.totalCount == 0) {
+        if (stats == null || stats.totalCount == 0) {
             // Just ignore the empty stats.
             // This could happen if the node is brand new or if the stats are not available, e.g. because it runs an old version.
             return;
@@ -315,7 +315,7 @@ public void add(CCSTelemetrySnapshot stats) {
      *   "p90": 2570
      * }
      */
-    public static void publishLatency(XContentBuilder builder, String name, LongMetricValue took) throws IOException {
+    private static void publishLatency(XContentBuilder builder, String name, LongMetricValue took) throws IOException {
         builder.startObject(name);
         {
             builder.field("max", took.max());
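
Note on the `add` hunk above: the guard now ignores a `null` snapshot as well as an empty one, so a node that reports no CCS telemetry cannot break aggregation. The sketch below shows the same pattern in isolation. `TelemetryTotals`, `NodeSnapshot`, and the `successCount` field are hypothetical stand-ins; only `totalCount` appears in the diff.

```java
// Sketch only: null-tolerant accumulation of per-node counters.
// TelemetryTotals, NodeSnapshot and successCount are hypothetical stand-ins.
final class TelemetryTotals {

    static final class NodeSnapshot {
        final long totalCount;
        final long successCount;

        NodeSnapshot(long totalCount, long successCount) {
            this.totalCount = totalCount;
            this.successCount = successCount;
        }
    }

    private long totalCount;
    private long successCount;

    // Merges one node's counters; missing or empty snapshots are skipped.
    void add(NodeSnapshot stats) {
        if (stats == null || stats.totalCount == 0) {
            return;
        }
        totalCount += stats.totalCount;
        successCount += stats.successCount;
    }

    long totalCount() {
        return totalCount;
    }

    long successCount() {
        return successCount;
    }

    public static void main(String[] args) {
        TelemetryTotals totals = new TelemetryTotals();
        totals.add(null);                   // e.g. a node that sent no telemetry
        totals.add(new NodeSnapshot(7, 7));
        System.out.println(totals.totalCount() + " total, " + totals.successCount() + " successful");
    }
}
```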
@@ -175,7 +175,7 @@ public static class PerClusterCCSTelemetry {
         // The number of successful (not skipped) requests to this cluster.
         private final LongAdder count;
         private final LongAdder skippedCount;
-        // This is only over the successful requetss, skipped ones do not count here.
+        // This is only over the successful requests, skipped ones do not count here.
         private final LongMetric took;
 
         PerClusterCCSTelemetry(String clusterAlias) {
@@ -30,6 +30,7 @@ public class ClusterStatsNodeResponse extends BaseNodeResponse {
     private final ClusterHealthStatus clusterStatus;
     private final SearchUsageStats searchUsageStats;
     private final RepositoryUsageStats repositoryUsageStats;
+    private final CCSTelemetrySnapshot ccsMetrics;
 
     public ClusterStatsNodeResponse(StreamInput in) throws IOException {
         super(in);
@@ -47,6 +48,11 @@ public ClusterStatsNodeResponse(StreamInput in) throws IOException {
         } else {
             repositoryUsageStats = RepositoryUsageStats.EMPTY;
         }
+        if (in.getTransportVersion().onOrAfter(TransportVersions.CCS_TELEMETRY_STATS)) {
+            ccsMetrics = new CCSTelemetrySnapshot(in);
+        } else {
+            ccsMetrics = new CCSTelemetrySnapshot();
+        }
     }
 
     public ClusterStatsNodeResponse(
@@ -56,7 +62,8 @@ public ClusterStatsNodeResponse(
         NodeStats nodeStats,
         ShardStats[] shardsStats,
         SearchUsageStats searchUsageStats,
-        RepositoryUsageStats repositoryUsageStats
+        RepositoryUsageStats repositoryUsageStats,
+        CCSTelemetrySnapshot ccsTelemetrySnapshot
     ) {
         super(node);
         this.nodeInfo = nodeInfo;
@@ -65,6 +72,7 @@ public ClusterStatsNodeResponse(
         this.clusterStatus = clusterStatus;
         this.searchUsageStats = Objects.requireNonNull(searchUsageStats);
         this.repositoryUsageStats = Objects.requireNonNull(repositoryUsageStats);
+        this.ccsMetrics = ccsTelemetrySnapshot;
     }
 
     public NodeInfo nodeInfo() {
@@ -95,6 +103,10 @@ public RepositoryUsageStats repositoryUsageStats() {
         return repositoryUsageStats;
     }
 
+    public CCSTelemetrySnapshot getCcsMetrics() {
+        return ccsMetrics;
+    }
+
     @Override
     public void writeTo(StreamOutput out) throws IOException {
         super.writeTo(out);
@@ -108,5 +120,9 @@ public void writeTo(StreamOutput out) throws IOException {
         if (out.getTransportVersion().onOrAfter(TransportVersions.REPOSITORIES_TELEMETRY)) {
             repositoryUsageStats.writeTo(out);
         } // else just drop these stats, ok for bwc
+        if (out.getTransportVersion().onOrAfter(TransportVersions.CCS_TELEMETRY_STATS)) {
+            ccsMetrics.writeTo(out);
+        }
     }
+
 }
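
Note on the serialization changes above: the new field is written only when the peer's transport version is on or after `CCS_TELEMETRY_STATS`, and the reader falls back to an empty snapshot otherwise, so both sides agree on the wire layout. The sketch below reproduces that pattern with `java.nio.ByteBuffer` instead of Elasticsearch's `StreamInput`/`StreamOutput`. `VersionGatedCodec`, its methods, and the single `long` payload are hypothetical simplifications; the version id is the literal from the `TransportVersions` hunk, and `>=` stands in for `onOrAfter`.

```java
import java.nio.ByteBuffer;

// Sketch only: version-gated write/read of an optional trailing field.
// VersionGatedCodec and its methods are hypothetical; the id comes from the diff.
final class VersionGatedCodec {
    static final int CCS_TELEMETRY_STATS = 8_739_00_0;

    // Writes the older field always, and the CCS total only for peers that understand it.
    static byte[] write(int peerVersion, long olderField, long ccsTotal) {
        boolean includeCcs = peerVersion >= CCS_TELEMETRY_STATS;
        ByteBuffer buffer = ByteBuffer.allocate(includeCcs ? 2 * Long.BYTES : Long.BYTES);
        buffer.putLong(olderField);
        if (includeCcs) {
            buffer.putLong(ccsTotal);
        }
        return buffer.array();
    }

    // Reads the CCS total if the sender's version carries it, otherwise returns an empty default.
    static long readCcsTotal(int peerVersion, byte[] payload) {
        ByteBuffer buffer = ByteBuffer.wrap(payload);
        buffer.getLong(); // skip the older field
        return peerVersion >= CCS_TELEMETRY_STATS ? buffer.getLong() : 0L;
    }

    public static void main(String[] args) {
        int oldPeer = 8_738_00_0;
        System.out.println(readCcsTotal(oldPeer, write(oldPeer, 1L, 7L)));
        System.out.println(readCcsTotal(CCS_TELEMETRY_STATS, write(CCS_TELEMETRY_STATS, 1L, 7L)));
    }
}
```

The key property is symmetry: writer and reader gate on the same version, so an older peer never sees bytes it cannot parse and a newer reader never consumes bytes that were not sent.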