Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[APM] Improve logic to determine when to use the transaction.duration.summary field #171315

Merged
merged 34 commits into from
Jan 16, 2024
Merged
Show file tree
Hide file tree
Changes from 32 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
0de5219
Improve logic to determine when to use the transaction.duration.summa…
crespocarlos Nov 15, 2023
6a53f6d
Merge branch 'main' of github.com:elastic/kibana into 167578-empty-la…
crespocarlos Nov 15, 2023
c7f7683
Simplifying the solution
crespocarlos Nov 16, 2023
ec4177d
Small optimization
crespocarlos Nov 20, 2023
9a1d36f
Add test case
crespocarlos Nov 20, 2023
83e1c55
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Nov 20, 2023
0e329c9
Fix after merge
crespocarlos Nov 20, 2023
0351b2f
Clean up
crespocarlos Nov 20, 2023
618cddc
CR fixes
crespocarlos Nov 23, 2023
0416fe9
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Nov 23, 2023
72782f3
Verify when hasDurationSummaryField is true on a cluster level
crespocarlos Nov 23, 2023
bb2d2cd
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Nov 27, 2023
a109347
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Dec 1, 2023
72f76b0
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Dec 12, 2023
ea01c51
Type fix
crespocarlos Dec 12, 2023
cf8a8f7
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Dec 13, 2023
28a82ee
CR fixes
crespocarlos Dec 13, 2023
3c89c0f
Revert changes to base_client
crespocarlos Dec 13, 2023
1710bf6
Revert changes to base_client
crespocarlos Dec 13, 2023
aa2e343
Clean up
crespocarlos Dec 13, 2023
7a80eb7
More clean up
crespocarlos Dec 13, 2023
196724c
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Dec 18, 2023
c5addd4
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Dec 20, 2023
f89d228
CR fixes
crespocarlos Dec 21, 2023
0709537
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Jan 2, 2024
0ee3106
[CI] Auto-commit changed files from 'node scripts/notice'
kibanamachine Jan 2, 2024
22d3e61
Refactor get_document_sources
crespocarlos Jan 3, 2024
ed45132
Small improvement
crespocarlos Jan 3, 2024
6b3f0bf
Clean up
crespocarlos Jan 3, 2024
33efa72
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Jan 4, 2024
6d5eed5
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Jan 8, 2024
f207b54
Merge branch 'main' into 167578-empty-latency-chart-fix
kibanamachine Jan 15, 2024
5897ee7
Rename variable
crespocarlos Jan 15, 2024
ecfa42c
Fix types
crespocarlos Jan 15, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

import { ApmFields, apm } from '@kbn/apm-synthtrace-client';
import { random } from 'lodash';
import { pipeline, Readable } from 'stream';
import semver from 'semver';
import { Scenario } from '../cli/scenario';
import {
addObserverVersionTransform,
deleteSummaryFieldTransform,
} from '../lib/utils/transform_helpers';
import { withClient } from '../lib/utils/with_client';

const scenario: Scenario<ApmFields> = async ({ logger, versionOverride }) => {
const version = versionOverride as string;
const isLegacy = versionOverride && semver.lt(version, '8.7.0');
return {
bootstrap: async ({ apmEsClient }) => {
if (isLegacy) {
apmEsClient.pipeline((base: Readable) => {
const defaultPipeline = apmEsClient.getDefaultPipeline()(
base
) as unknown as NodeJS.ReadableStream;

return pipeline(
defaultPipeline,
addObserverVersionTransform(version),
deleteSummaryFieldTransform(),
(err) => {
if (err) {
logger.error(err);
}
}
);
});
}
},
generate: ({ range, clients: { apmEsClient } }) => {
const successfulTimestamps = range.ratePerMinute(6);
const instance = apm
.service({
name: `java${isLegacy ? '-legacy' : ''}`,
environment: 'production',
agentName: 'java',
})
.instance(`instance`);

return withClient(
apmEsClient,
successfulTimestamps.generator((timestamp) => {
const randomHigh = random(1000, 4000);
const randomLow = random(100, randomHigh / 5);
const duration = random(randomLow, randomHigh);
return instance
.transaction({ transactionName: 'GET /order/{id}' })
.timestamp(timestamp)
.duration(duration)
.success();
})
);
},
};
};

export default scenario;
263 changes: 138 additions & 125 deletions x-pack/plugins/apm/server/lib/helpers/get_document_sources.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,22 @@ import { ApmDocumentType } from '../../../common/document_type';
import { RollupInterval } from '../../../common/rollup';
import { APMEventClient } from './create_es_client/create_apm_event_client';
import { getConfigForDocumentType } from './create_es_client/document_type';
import { TRANSACTION_DURATION_SUMMARY } from '../../../common/es_fields/apm';
import { TimeRangeMetadata } from '../../../common/time_range_metadata';
import { getDurationLegacyFilter } from './transactions';

const QUERY_INDEX = {
BEFORE: 0,
CURRENT: 1,
DURATION_SUMMARY: 2,
} as const;

interface DocumentTypeData {
documentType: ApmDocumentType;
rollupInterval: RollupInterval;
hasDocBefore: boolean;
hasDocAfter: boolean;
hasDurationSummary: boolean;
}

const getRequest = ({
documentType,
Expand Down Expand Up @@ -64,170 +78,169 @@ export async function getDocumentSources({
kuery: string;
enableServiceTransactionMetrics: boolean;
enableContinuousRollups: boolean;
}) {
const currentRange = rangeQuery(start, end);
const diff = end - start;
const kql = kqlQuery(kuery);
const beforeRange = rangeQuery(start - diff, end - diff);

const sourcesToCheck = [
}): Promise<TimeRangeMetadata['sources']> {
const documentTypesToCheck = [
...(enableServiceTransactionMetrics
? [ApmDocumentType.ServiceTransactionMetric as const]
: []),
ApmDocumentType.TransactionMetric as const,
].flatMap((documentType) => {
const docTypeConfig = getConfigForDocumentType(documentType);

return (
enableContinuousRollups
? docTypeConfig.rollupIntervals
: [RollupInterval.OneMinute]
).flatMap((rollupInterval) => {
return {
documentType,
rollupInterval,
meta: {
checkSummaryFieldExists: false,
},
before: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...beforeRange],
}),
current: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...currentRange],
}),
};
});
];

const documentTypesInfo = await getDocumentTypesInfo({
apmEventClient,
start,
end,
kuery,
enableContinuousRollups,
documentTypesToCheck,
});

const sourcesToCheckWithSummary = [
ApmDocumentType.TransactionMetric as const,
].flatMap((documentType) => {
const docTypeConfig = getConfigForDocumentType(documentType);

return (
enableContinuousRollups
? docTypeConfig.rollupIntervals
: [RollupInterval.OneMinute]
).flatMap((rollupInterval) => {
const summaryExistsFilter = {
bool: {
filter: [
{
exists: {
field: TRANSACTION_DURATION_SUMMARY,
},
},
],
},
};
const hasAnySourceDocBefore = documentTypesInfo.some(
(source) => source.hasDocBefore
);

return {
documentType,
rollupInterval,
meta: {
checkSummaryFieldExists: true,
},
before: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...beforeRange, summaryExistsFilter],
}),
current: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...currentRange, summaryExistsFilter],
}),
};
});
return [
...mapToSources(documentTypesInfo, hasAnySourceDocBefore),
{
documentType: ApmDocumentType.TransactionEvent,
rollupInterval: RollupInterval.None,
hasDocs: true,
hasDurationSummaryField: false,
},
];
}

const getDocumentTypesInfo = async ({
apmEventClient,
start,
end,
kuery,
enableContinuousRollups,
documentTypesToCheck,
}: {
apmEventClient: APMEventClient;
start: number;
end: number;
kuery: string;
enableContinuousRollups: boolean;
documentTypesToCheck: ApmDocumentType[];
}) => {
const getRequests = getDocumentTypeRequestsFn({
enableContinuousRollups,
start,
end,
kuery,
});

const allSourcesToCheck = [...sourcesToCheck, ...sourcesToCheckWithSummary];
const sourceRequests = documentTypesToCheck.flatMap(getRequests);

const allSearches = allSourcesToCheck.flatMap(({ before, current }) => [
before,
current,
]);
const allSearches = sourceRequests
.flatMap(({ before, current, durationSummaryCheck }) => [
before,
current,
durationSummaryCheck,
])
.filter(
(request): request is ReturnType<typeof getRequest> =>
request !== undefined
);

const allResponses = (
await apmEventClient.msearch('get_document_availability', ...allSearches)
).responses;

const checkedSources = allSourcesToCheck.map((source, index) => {
const { documentType, rollupInterval } = source;
const responseBefore = allResponses[index * 2];
const responseAfter = allResponses[index * 2 + 1];

const hasDocBefore = responseBefore.hits.total.value > 0;
const hasDocAfter = responseAfter.hits.total.value > 0;
return sourceRequests.map(({ documentType, rollupInterval, ...queries }) => {
const numberOfQueries = Object.values(queries).filter(Boolean).length;
// allResponses is sorted by the order of the requests in sourceRequests
const docTypeResponses = allResponses.splice(0, numberOfQueries);

return {
documentType,
rollupInterval,
hasDocBefore,
hasDocAfter,
checkSummaryFieldExists: source.meta.checkSummaryFieldExists,
hasDocBefore: docTypeResponses[QUERY_INDEX.BEFORE].hits.total.value > 0,
hasDocAfter: docTypeResponses[QUERY_INDEX.CURRENT].hits.total.value > 0,
hasDurationSummary: docTypeResponses[QUERY_INDEX.DURATION_SUMMARY]
Copy link
Member

@dgieselaar dgieselaar Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one small nit here: maybe we should rename to allHaveDurationSummary?

? docTypeResponses[QUERY_INDEX.DURATION_SUMMARY].hits.total.value === 0
: true,
};
});
};

const hasAnySourceDocBefore = checkedSources.some(
(source) => source.hasDocBefore
);
const getDocumentTypeRequestsFn =
({
enableContinuousRollups,
start,
end,
kuery,
}: {
enableContinuousRollups: boolean;
start: number;
end: number;
kuery: string;
}) =>
(documentType: ApmDocumentType) => {
const currentRange = rangeQuery(start, end);
const diff = end - start;
const kql = kqlQuery(kuery);
const beforeRange = rangeQuery(start - diff, end - diff);

const rollupIntervals = enableContinuousRollups
? getConfigForDocumentType(documentType).rollupIntervals
: [RollupInterval.OneMinute];

return rollupIntervals.map((rollupInterval) => ({
documentType,
rollupInterval,
before: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...beforeRange],
}),
current: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...currentRange],
}),
...(documentType !== ApmDocumentType.ServiceTransactionMetric
? {
durationSummaryCheck: getRequest({
documentType,
rollupInterval,
filters: [...kql, ...currentRange, getDurationLegacyFilter()],
}),
}
: {}),
}));
Comment on lines +204 to +213
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not check both before/after here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to - but I might be wrong. The hasDoc case seems to be different, at least based on the comment provided. For the durationSummary I think we're only interested if in the current timestamp all documents have the new field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right! Because the check is inversed (which is not possible with document sources), checking the current time range is enough.

};

const sourcesWithHasDocs = checkedSources.map((checkedSource) => {
const mapToSources = (
sources: DocumentTypeData[],
hasAnySourceDocBefore: boolean
) => {
return sources.map((source) => {
const {
documentType,
hasDocAfter,
hasDocBefore,
rollupInterval,
checkSummaryFieldExists,
} = checkedSource;
hasDurationSummary,
} = source;

const hasDocBeforeOrAfter = hasDocBefore || hasDocAfter;

// If there is any data before, we require that data is available before
// this time range to mark this source as available. If we don't do that,
// users that upgrade to a version that starts generating service tx metrics
// will see a mostly empty screen for a while after upgrading.
// If we only check before, users with a new deployment will use raw transaction
// events.
const hasDocs = hasAnySourceDocBefore ? hasDocBefore : hasDocBeforeOrAfter;

return {
documentType,
rollupInterval,
checkSummaryFieldExists,
hasDocs,
hasDurationSummaryField: hasDurationSummary,
};
});

const sources: TimeRangeMetadata['sources'] = sourcesWithHasDocs
.filter((source) => !source.checkSummaryFieldExists)
.map((checkedSource) => {
const { documentType, hasDocs, rollupInterval } = checkedSource;
return {
documentType,
rollupInterval,
hasDocs,
hasDurationSummaryField:
documentType === ApmDocumentType.ServiceTransactionMetric ||
Boolean(
sourcesWithHasDocs.find((eSource) => {
return (
eSource.documentType === documentType &&
eSource.rollupInterval === rollupInterval &&
eSource.checkSummaryFieldExists
);
})?.hasDocs
),
};
});

return sources.concat({
documentType: ApmDocumentType.TransactionEvent,
rollupInterval: RollupInterval.None,
hasDocs: true,
hasDurationSummaryField: false,
});
}
};
Loading