[infra] Shorten IDs for ML jobs (#168234)
Closes #47477

### Summary

ML job IDs are limited to 64 characters. For the log ML jobs we prefix
the job names (`log-entry-rate` and `log-entry-categories-count`) with
the string `kibana-logs-ui` plus the space and log view IDs, which can
quickly exhaust the 64-character limit (even our own Stack Monitoring
log view hits it). For example, a hypothetical space
`team-observability` with a log view `my-custom-logs` would yield
`kibana-logs-ui-team-observability-my-custom-logs-log-entry-categories-count`,
which is 75 characters. Once the limit is exceeded, users cannot create
the ML jobs at all; renaming a space or log view is hard, and the limit
is not hinted at during space creation (since the two are unrelated in
some sense).

To give the ID a stable length, this PR introduces a new prefix format:
a UUID v5 seeded with the space and log view IDs (with the dashes
removed so that the categorization job still fits within the size
limit).
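
A minimal sketch of the derivation (assuming the `uuid` package's `v5`
export and hypothetical space/log view IDs; the actual implementation is
in `job_parameters.ts` further down):

```ts
import { v5 } from 'uuid';

// Namespace used to seed the UUID v5 (from job_parameters.ts)
const ID_NAMESPACE = 'f91b78c0-fdd3-425d-a4ba-4c028fe57e0f';

// Hypothetical space and log view IDs of arbitrary length
const spaceId = 'team-observability';
const logViewId = 'my-custom-logs';

// A UUID is 36 characters; removing the 4 dashes leaves a fixed 32-character hash
const hash = v5(`${spaceId}-${logViewId}`, ID_NAMESPACE).replaceAll('-', '');

// 'logs-' (5) + hash (32) + '-' (1) + 'log-entry-categories-count' (26) = 64 characters,
// exactly at the ML job ID limit, regardless of how long the space or log view IDs are
const jobId = `logs-${hash}-log-entry-categories-count`;
console.log(jobId.length); // 64
```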

Since there is no technical difference between the new and old formats,
this PR makes an effort to keep supporting the old format and to allow
migration of old jobs as needed. Old jobs still work and may contain
important data, so users should not feel forced to migrate.

The main addition is a small new API that checks whether any ML jobs
exist and which ID format they use, so that the app can request data
accordingly. The existing APIs have been modified to take the ID format
into account (except during creation, which always uses the new
format).
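
A hedged sketch of how a client could resolve the formats — the route
path and payload codecs come from the new `id_formats.ts` below, while
the HTTP method and import paths are assumptions modelled on the other
API callers in this PR:

```ts
import type { HttpHandler } from '@kbn/core/public';
import {
  LOG_ANALYSIS_GET_ID_FORMATS,
  getLogAnalysisIdFormatsRequestPayloadRT,
  getLogAnalysisIdFormatsSuccessResponsePayloadRT,
} from '../common/http_api/latest';
import { decodeOrThrow } from '../common/runtime_types';

// Asks the server which ID format ('legacy' or 'hashed') each job type currently uses
export const callGetIdFormats = async (
  requestArgs: { spaceId: string; logViewId: string },
  fetch: HttpHandler
) => {
  const response = await fetch(LOG_ANALYSIS_GET_ID_FORMATS, {
    method: 'POST', // assumption: the route takes the payload in a POST body
    body: JSON.stringify(
      getLogAnalysisIdFormatsRequestPayloadRT.encode({
        data: { spaceId: requestArgs.spaceId, logViewId: requestArgs.logViewId },
      })
    ),
  });

  // e.g. { 'log-entry-rate': 'hashed', 'log-entry-categories-count': 'legacy' }
  return decodeOrThrow(getLogAnalysisIdFormatsSuccessResponsePayloadRT)(response).data;
};
```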

The solution applied here is not ideal. It simply passes the ID format,
along with the space and log view IDs, to every point where the ID is
re-created (and there are several). The ideal solution would be to store
the job data in the store and pass that around instead, but that seemed
like a considerably larger effort. This PR does introduce some
functional tests around the ML job creation process, so such a future
refactor should be a bit safer than before (the sketch below shows how
call sites change).
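
For illustration, with the updated signatures in `job_parameters.ts` a
call site now looks roughly like this (import paths and ID values are
hypothetical):

```ts
import type { IdFormat } from '../common/http_api/latest';
import { getDatafeedId, getJobId } from '../common/log_analysis';

const spaceId = 'default';
const logViewId = 'default';
// Resolved via the new id_formats API rather than assumed
const idFormat: IdFormat = 'hashed';

// legacy:  kibana-logs-ui-default-default-log-entry-rate
// hashed:  logs-<32 hex characters>-log-entry-rate
const jobId = getJobId(spaceId, logViewId, idFormat, 'log-entry-rate');
const datafeedId = getDatafeedId(spaceId, logViewId, idFormat, 'log-entry-rate');
```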

### How to test

* Start from `main`
* Start Elasticsearch
* Start Kibana
* Load the Sample web logs (Kibana home -> Try sample data -> Other
sample data sets)
* Visit the Anomalies page in the Logs UI
* Set up either or both of the two ML jobs and wait for some results to
show up
* Check out the PR branch
* Visit the Anomalies page and verify that it still works (a request
goes out to resolve the ID format; it should return 'legacy', which
should then load the data for the legacy job)
* Recreate the ML job and verify that the new job works and results
still show up (new requests should go out using the new format, which
may be a mixed mode if you have two jobs and only migrate one of them;
see the example below)
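
In that mixed-mode case, the ID format resolution would report something
like the following (a sketch based on the response codec in
`id_formats.ts`; the import path is illustrative):

```ts
import type { GetLogAnalysisIdFormatsSuccessResponsePayload } from '../common/http_api/latest';

// One job recreated with the new format, the other left untouched
const mixedModeResponse: GetLogAnalysisIdFormatsSuccessResponsePayload = {
  data: {
    'log-entry-rate': 'hashed',
    'log-entry-categories-count': 'legacy',
  },
};
```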

---------

Co-authored-by: kibanamachine <[email protected]>
miltonhultgren and kibanamachine authored Nov 19, 2023
1 parent 37bf74b commit 4963e6b
Showing 71 changed files with 1,434 additions and 156 deletions.
1 change: 1 addition & 0 deletions x-pack/plugins/infra/common/http_api/latest.ts
@@ -10,3 +10,4 @@ export * from './log_alerts/v1';
export * from './log_analysis/results/v1';
export * from './log_analysis/validation/v1';
export * from './metrics_explorer_views/v1';
export * from './log_analysis/id_formats/v1/id_formats';
@@ -0,0 +1,39 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import * as rt from 'io-ts';
import { logEntryRateJobTypeRT, logEntryCategoriesJobTypeRT } from '../../../../log_analysis';

export const idFormatRT = rt.union([rt.literal('legacy'), rt.literal('hashed')]);
export type IdFormat = rt.TypeOf<typeof idFormatRT>;

const jobTypeRT = rt.union([logEntryRateJobTypeRT, logEntryCategoriesJobTypeRT]);
export type JobType = rt.TypeOf<typeof jobTypeRT>;

export const idFormatByJobTypeRT = rt.record(jobTypeRT, idFormatRT);
export type IdFormatByJobType = rt.TypeOf<typeof idFormatByJobTypeRT>;

export const LOG_ANALYSIS_GET_ID_FORMATS = '/api/infra/log_analysis/id_formats';

export const getLogAnalysisIdFormatsRequestPayloadRT = rt.type({
data: rt.type({
logViewId: rt.string,
spaceId: rt.string,
}),
});

export type GetLogAnalysisIdFormatsRequestPayload = rt.TypeOf<
typeof getLogAnalysisIdFormatsRequestPayloadRT
>;

export const getLogAnalysisIdFormatsSuccessResponsePayloadRT = rt.type({
data: rt.record(rt.union([logEntryRateJobTypeRT, logEntryCategoriesJobTypeRT]), idFormatRT),
});

export type GetLogAnalysisIdFormatsSuccessResponsePayload = rt.TypeOf<
typeof getLogAnalysisIdFormatsSuccessResponsePayloadRT
>;
@@ -8,6 +8,7 @@
import * as rt from 'io-ts';

import { persistedLogViewReferenceRT } from '@kbn/logs-shared-plugin/common';
import { idFormatByJobTypeRT } from '../../id_formats/v1/id_formats';
import { timeRangeRT, routeTimingMetadataRT } from '../../../shared';
import {
logEntryAnomalyRT,
@@ -54,6 +55,7 @@ export const getLogEntryAnomaliesRequestPayloadRT = rt.type({
rt.type({
// log view
logView: persistedLogViewReferenceRT,
idFormats: idFormatByJobTypeRT,
// the time range to fetch the log entry anomalies from
timeRange: timeRangeRT,
}),
@@ -14,6 +14,7 @@ import {
timeRangeRT,
routeTimingMetadataRT,
} from '../../../shared';
import { idFormatByJobTypeRT } from '../../id_formats/v1/id_formats';

export const LOG_ANALYSIS_GET_LOG_ENTRY_ANOMALIES_DATASETS_PATH =
'/api/infra/log_analysis/results/log_entry_anomalies_datasets';
@@ -26,6 +27,7 @@ export const getLogEntryAnomaliesDatasetsRequestPayloadRT = rt.type({
data: rt.type({
// log view
logView: persistedLogViewReferenceRT,
idFormats: idFormatByJobTypeRT,
// the time range to fetch the anomalies datasets from
timeRange: timeRangeRT,
}),
@@ -8,6 +8,7 @@
import * as rt from 'io-ts';

import { persistedLogViewReferenceRT } from '@kbn/logs-shared-plugin/common';
import { idFormatRT } from '../../id_formats/v1/id_formats';
import {
badRequestErrorRT,
forbiddenErrorRT,
@@ -41,6 +42,7 @@ export const getLogEntryCategoriesRequestPayloadRT = rt.type({
categoryCount: rt.number,
// log view
logView: persistedLogViewReferenceRT,
idFormat: idFormatRT,
// the time range to fetch the categories from
timeRange: timeRangeRT,
// a list of histograms to create
@@ -8,6 +8,7 @@
import * as rt from 'io-ts';

import { persistedLogViewReferenceRT } from '@kbn/logs-shared-plugin/common';
import { idFormatRT } from '../../id_formats/v1/id_formats';
import {
badRequestErrorRT,
forbiddenErrorRT,
@@ -25,6 +26,7 @@ export const getLogEntryCategoryDatasetsRequestPayloadRT = rt.type({
data: rt.type({
// log view
logView: persistedLogViewReferenceRT,
idFormat: idFormatRT,
// the time range to fetch the category datasets from
timeRange: timeRangeRT,
}),
@@ -7,6 +7,7 @@

import { logEntryContextRT, persistedLogViewReferenceRT } from '@kbn/logs-shared-plugin/common';
import * as rt from 'io-ts';
import { idFormatRT } from '../../id_formats/v1/id_formats';
import {
badRequestErrorRT,
forbiddenErrorRT,
@@ -29,6 +30,7 @@ export const getLogEntryCategoryExamplesRequestPayloadRT = rt.type({
exampleCount: rt.number,
// log view
logView: persistedLogViewReferenceRT,
idFormat: idFormatRT,
// the time range to fetch the category examples from
timeRange: timeRangeRT,
}),
@@ -7,6 +7,7 @@

import * as rt from 'io-ts';
import { persistedLogViewReferenceRT } from '@kbn/logs-shared-plugin/common';
import { idFormatRT } from '../../id_formats/v1/id_formats';
import { logEntryExampleRT } from '../../../../log_analysis';
import {
badRequestErrorRT,
@@ -31,6 +32,7 @@ export const getLogEntryExamplesRequestPayloadRT = rt.type({
exampleCount: rt.number,
// logView
logView: persistedLogViewReferenceRT,
idFormat: idFormatRT,
// the time range to fetch the log rate examples from
timeRange: timeRangeRT,
}),
32 changes: 26 additions & 6 deletions x-pack/plugins/infra/common/log_analysis/job_parameters.ts
@@ -6,21 +6,41 @@
*/

import * as rt from 'io-ts';
import { v5 } from 'uuid';
import { IdFormat, JobType } from '../http_api/latest';

export const bucketSpan = 900000;

export const categoriesMessageField = 'message';

export const partitionField = 'event.dataset';

export const getJobIdPrefix = (spaceId: string, sourceId: string) =>
`kibana-logs-ui-${spaceId}-${sourceId}-`;
const ID_NAMESPACE = 'f91b78c0-fdd3-425d-a4ba-4c028fe57e0f';

export const getJobId = (spaceId: string, logViewId: string, jobType: string) =>
`${getJobIdPrefix(spaceId, logViewId)}${jobType}`;
export const getJobIdPrefix = (spaceId: string, sourceId: string, idFormat: IdFormat) => {
if (idFormat === 'legacy') {
return `kibana-logs-ui-${spaceId}-${sourceId}-`;
} else {
// A UUID is 36 characters but based on the ML job names for logs, our limit is 32 characters
// Thus we remove the 4 dashes
const uuid = v5(`${spaceId}-${sourceId}`, ID_NAMESPACE).replaceAll('-', '');
return `logs-${uuid}-`;
}
};

export const getDatafeedId = (spaceId: string, logViewId: string, jobType: string) =>
`datafeed-${getJobId(spaceId, logViewId, jobType)}`;
export const getJobId = (
spaceId: string,
logViewId: string,
idFormat: IdFormat,
jobType: JobType
) => `${getJobIdPrefix(spaceId, logViewId, idFormat)}${jobType}`;

export const getDatafeedId = (
spaceId: string,
logViewId: string,
idFormat: IdFormat,
jobType: JobType
) => `datafeed-${getJobId(spaceId, logViewId, idFormat, jobType)}`;

export const datasetFilterRT = rt.union([
rt.strict({
@@ -8,16 +8,16 @@
import * as rt from 'io-ts';
import { sortRT } from './log_analysis_results';

export const logEntryCategoriesJobTypeRT = rt.keyof({
'log-entry-categories-count': null,
});
export const logEntryCategoriesJobTypeRT = rt.literal('log-entry-categories-count');

export type LogEntryCategoriesJobType = rt.TypeOf<typeof logEntryCategoriesJobTypeRT>;

export const logEntryCategoriesJobTypes: LogEntryCategoriesJobType[] = [
'log-entry-categories-count',
];

export const logEntryCategoriesJobType: LogEntryCategoriesJobType = 'log-entry-categories-count';

export const logEntryCategoryDatasetRT = rt.type({
name: rt.string,
maximumAnomalyScore: rt.number,
@@ -7,10 +7,9 @@

import * as rt from 'io-ts';

export const logEntryRateJobTypeRT = rt.keyof({
'log-entry-rate': null,
});
export const logEntryRateJobTypeRT = rt.literal('log-entry-rate');

export type LogEntryRateJobType = rt.TypeOf<typeof logEntryRateJobTypeRT>;

export const logEntryRateJobTypes: LogEntryRateJobType[] = ['log-entry-rate'];
export const logEntryRateJobType: LogEntryRateJobType = 'log-entry-rate';
export const logEntryRateJobTypes: LogEntryRateJobType[] = [logEntryRateJobType];
@@ -7,6 +7,7 @@

import { EuiFlexGroup, EuiFlexItem } from '@elastic/eui';
import React, { useCallback } from 'react';
import { logEntryCategoriesJobType, logEntryRateJobType } from '../../../../../common/log_analysis';
import { useLogAnalysisCapabilitiesContext } from '../../../../containers/logs/log_analysis';
import {
logEntryCategoriesModule,
@@ -40,7 +41,7 @@ export const LogAnalysisModuleList: React.FC<{
<EuiFlexGroup>
<EuiFlexItem>
<LogAnalysisModuleListCard
jobId={logEntryRateJobIds['log-entry-rate']}
jobId={logEntryRateJobIds[logEntryRateJobType]}
hasSetupCapabilities={hasLogAnalysisSetupCapabilities}
moduleDescription={logEntryRateModule.moduleDescription}
moduleName={logEntryRateModule.moduleName}
@@ -50,7 +51,7 @@ export const LogAnalysisModuleList: React.FC<{
</EuiFlexItem>
<EuiFlexItem>
<LogAnalysisModuleListCard
jobId={logEntryCategoriesJobIds['log-entry-categories-count']}
jobId={logEntryCategoriesJobIds[logEntryCategoriesJobType]}
hasSetupCapabilities={hasLogAnalysisSetupCapabilities}
moduleDescription={logEntryCategoriesModule.moduleDescription}
moduleName={logEntryCategoriesModule.moduleName}
@@ -34,7 +34,12 @@ export const LogAnalysisSetupFlyout: React.FC<{
}

return (
<EuiFlyout aria-labelledby={FLYOUT_HEADING_ID} maxWidth={800} onClose={closeFlyout}>
<EuiFlyout
aria-labelledby={FLYOUT_HEADING_ID}
maxWidth={800}
onClose={closeFlyout}
data-test-subj="infraLogAnalysisSetupFlyout"
>
<EuiFlyoutHeader hasBorder>
<EuiTitle>
<h2 id={FLYOUT_HEADING_ID}>
@@ -8,28 +8,30 @@
import * as rt from 'io-ts';
import type { HttpHandler } from '@kbn/core/public';

import { IdFormat, JobType } from '../../../../../common/http_api/latest';
import { getDatafeedId, getJobId } from '../../../../../common/log_analysis';
import { decodeOrThrow } from '../../../../../common/runtime_types';

interface DeleteJobsRequestArgs<JobType extends string> {
interface DeleteJobsRequestArgs<T extends JobType> {
spaceId: string;
logViewId: string;
jobTypes: JobType[];
idFormat: IdFormat;
jobTypes: T[];
}

export const callDeleteJobs = async <JobType extends string>(
requestArgs: DeleteJobsRequestArgs<JobType>,
export const callDeleteJobs = async <T extends JobType>(
requestArgs: DeleteJobsRequestArgs<T>,
fetch: HttpHandler
) => {
const { spaceId, logViewId, jobTypes } = requestArgs;
const { spaceId, logViewId, idFormat, jobTypes } = requestArgs;

// NOTE: Deleting the jobs via this API will delete the datafeeds at the same time
const deleteJobsResponse = await fetch('/internal/ml/jobs/delete_jobs', {
method: 'POST',
version: '1',
body: JSON.stringify(
deleteJobsRequestPayloadRT.encode({
jobIds: jobTypes.map((jobType) => getJobId(spaceId, logViewId, jobType)),
jobIds: jobTypes.map((jobType) => getJobId(spaceId, logViewId, idFormat, jobType)),
})
),
});
@@ -45,25 +47,28 @@ export const callGetJobDeletionTasks = async (fetch: HttpHandler) => {
return decodeOrThrow(getJobDeletionTasksResponsePayloadRT)(jobDeletionTasksResponse);
};

interface StopDatafeedsRequestArgs<JobType extends string> {
interface StopDatafeedsRequestArgs<T extends JobType> {
spaceId: string;
logViewId: string;
jobTypes: JobType[];
idFormat: IdFormat;
jobTypes: T[];
}

export const callStopDatafeeds = async <JobType extends string>(
requestArgs: StopDatafeedsRequestArgs<JobType>,
export const callStopDatafeeds = async <T extends JobType>(
requestArgs: StopDatafeedsRequestArgs<T>,
fetch: HttpHandler
) => {
const { spaceId, logViewId, jobTypes } = requestArgs;
const { spaceId, logViewId, idFormat, jobTypes } = requestArgs;

// Stop datafeed due to https://github.com/elastic/kibana/issues/44652
const stopDatafeedResponse = await fetch('/internal/ml/jobs/stop_datafeeds', {
method: 'POST',
version: '1',
body: JSON.stringify(
stopDatafeedsRequestPayloadRT.encode({
datafeedIds: jobTypes.map((jobType) => getDatafeedId(spaceId, logViewId, jobType)),
datafeedIds: jobTypes.map((jobType) =>
getDatafeedId(spaceId, logViewId, idFormat, jobType)
),
})
),
});
@@ -8,26 +8,28 @@
import * as rt from 'io-ts';
import type { HttpHandler } from '@kbn/core/public';

import { IdFormat, JobType } from '../../../../../common/http_api/latest';
import { getJobId, jobCustomSettingsRT } from '../../../../../common/log_analysis';
import { decodeOrThrow } from '../../../../../common/runtime_types';

interface RequestArgs<JobType extends string> {
interface RequestArgs<T extends JobType> {
spaceId: string;
logViewId: string;
jobTypes: JobType[];
idFormat: IdFormat;
jobTypes: T[];
}

export const callJobsSummaryAPI = async <JobType extends string>(
requestArgs: RequestArgs<JobType>,
export const callJobsSummaryAPI = async <T extends JobType>(
requestArgs: RequestArgs<T>,
fetch: HttpHandler
) => {
const { spaceId, logViewId, jobTypes } = requestArgs;
const { spaceId, logViewId, idFormat, jobTypes } = requestArgs;
const response = await fetch('/internal/ml/jobs/jobs_summary', {
method: 'POST',
version: '1',
body: JSON.stringify(
fetchJobStatusRequestPayloadRT.encode({
jobIds: jobTypes.map((jobType) => getJobId(spaceId, logViewId, jobType)),
jobIds: jobTypes.map((jobType) => getJobId(spaceId, logViewId, idFormat, jobType)),
})
),
});
@@ -46,7 +46,7 @@ export const callSetupMlModuleAPI = async (requestArgs: RequestArgs, fetch: Http
start,
end,
indexPatternName: indexPattern,
prefix: getJobIdPrefix(spaceId, sourceId),
prefix: getJobIdPrefix(spaceId, sourceId, 'hashed'),
startDatafeed: true,
jobOverrides,
datafeedOverrides,