allows AI assistant to get info about available indices and their fields #165952

Merged: 35 commits, Oct 20, 2023
4d0c552
allows AI assistant to get info about available dataviews and their f…
ppisljar Sep 7, 2023
0de2923
switch from dataviews to indices
ppisljar Sep 18, 2023
7307ef1
Update x-pack/plugins/observability_ai_assistant/public/functions/get…
ppisljar Sep 19, 2023
520752b
fixing based on review
ppisljar Sep 19, 2023
4c4bc93
updating lens function to support setting timefield
ppisljar Sep 19, 2023
c7d7899
improving prompt
ppisljar Sep 20, 2023
90b9dfd
[CI] Auto-commit changed files from 'node scripts/precommit_hook.js -…
kibanamachine Sep 19, 2023
c007d7e
improving prompt
ppisljar Sep 20, 2023
fc0f00f
improving prompt
ppisljar Sep 21, 2023
7e2b0bc
moving get_dataset_info logic to the endpoint
ppisljar Sep 26, 2023
414c8bd
cleanup prompt
ppisljar Sep 26, 2023
506e9ab
sending more compact representation to LLM
ppisljar Sep 26, 2023
5a1c707
[CI] Auto-commit changed files from 'node scripts/precommit_hook.js -…
kibanamachine Sep 26, 2023
74f0e88
[CI] Auto-commit changed files from 'node scripts/eslint --no-cache -…
kibanamachine Sep 26, 2023
2297b2b
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Sep 27, 2023
7b8ac69
updating based on review
ppisljar Oct 2, 2023
5c024ab
Merge remote-tracking branch 'origin/aiassistant/get_dataviews' into …
ppisljar Oct 2, 2023
4e3a370
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 2, 2023
ab66649
[CI] Auto-commit changed files from 'node scripts/eslint --no-cache -…
kibanamachine Oct 2, 2023
79c964c
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 2, 2023
daa2b2e
adding datastreams to the list
ppisljar Oct 4, 2023
705611a
[CI] Auto-commit changed files from 'node scripts/precommit_hook.js -…
kibanamachine Oct 4, 2023
0facc03
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 5, 2023
bd28b97
compressing field list
ppisljar Oct 11, 2023
205fb8d
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 11, 2023
d17cd4b
limiting to 500 fields
ppisljar Oct 11, 2023
b198e47
[CI] Auto-commit changed files from 'node scripts/eslint --no-cache -…
kibanamachine Oct 11, 2023
727c4d4
fix file name
ppisljar Oct 11, 2023
64a324a
Merge remote-tracking branch 'origin/aiassistant/get_dataviews' into …
ppisljar Oct 11, 2023
ee6a1ae
Set overflow-wrap to anywhere to prevent overflowing
dgieselaar Oct 11, 2023
701026a
Don't compress fields & don't send types
dgieselaar Oct 14, 2023
2606cd2
Merge branch 'main' of github.com:elastic/kibana into aiassistant/get…
dgieselaar Oct 14, 2023
050e971
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 16, 2023
d3bf6e1
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 16, 2023
4364312
Merge branch 'main' into aiassistant/get_dataviews
ppisljar Oct 18, 2023
3 changes: 2 additions & 1 deletion x-pack/plugins/observability_ai_assistant/kibana.jsonc
@@ -18,7 +18,8 @@
"security",
"share",
"taskManager",
- "triggersActionsUi"
+ "triggersActionsUi",
+ "dataViews"
],
"requiredBundles": ["fieldFormats", "kibanaReact", "kibanaUtils"],
"optionalPlugins": [],
@@ -110,7 +110,7 @@ const esqlLanguagePlugin = () => {

export function MessageText({ loading, content, onActionClick }: Props) {
const containerClassName = css`
- overflow-wrap: break-word;
+ overflow-wrap: anywhere;
`;

const onActionClickRef = useRef(onActionClick);
@@ -0,0 +1,153 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { chunk, groupBy, uniq } from 'lodash';
import { CreateChatCompletionResponse } from 'openai';
import { FunctionVisibility, MessageRole, RegisterFunctionDefinition } from '../../common/types';
import type { ObservabilityAIAssistantService } from '../types';

export function registerGetDatasetInfoFunction({
service,
registerFunction,
}: {
service: ObservabilityAIAssistantService;
registerFunction: RegisterFunctionDefinition;
}) {
registerFunction(
{
name: 'get_dataset_info',
contexts: ['core'],
visibility: FunctionVisibility.System,
description: `Use this function to get information about indices/datasets available and the fields available on them.

providing empty string as index name will retrieve all indices
else list of all fields for the given index will be given. if no fields are returned this means no indices were matched by provided index pattern.
wildcards can be part of index name.`,
descriptionForUser:
'This function allows the assistant to get information about available indices and their fields.',
parameters: {
type: 'object',
additionalProperties: false,
properties: {
index: {
type: 'string',
description:
'index pattern the user is interested in or empty string to get information about all available indices',
},
},
required: ['index'],
} as const,
},
async ({ arguments: { index }, messages, connectorId }, signal) => {
const response = await service.callApi(
'POST /internal/observability_ai_assistant/functions/get_dataset_info',
{
params: {
body: {
index,
},
},
signal,
}
);

const allFields = response.fields;

const fieldNames = uniq(allFields.map((field) => field.name));

const groupedFields = groupBy(allFields, (field) => field.name);

const relevantFields = await Promise.all(
chunk(fieldNames, 500).map(async (fieldsInChunk) => {
const chunkResponse = (await service.callApi(
'POST /internal/observability_ai_assistant/chat',
{
signal,
params: {
query: {
stream: false,
},
body: {
connectorId,
messages: [
{
'@timestamp': new Date().toISOString(),
message: {
role: MessageRole.System,
content: `You are a helpful assistant for Elastic Observability.
Your task is to create a list of field names that are relevant
to the conversation, using ONLY the list of fields and
types provided in the last user message. DO NOT UNDER ANY
CIRCUMSTANCES include fields not mentioned in this list.`,
},
},
...messages.slice(1),
{
'@timestamp': new Date().toISOString(),
message: {
role: MessageRole.User,
content: `This is the list:

${fieldsInChunk.join('\n')}`,
},
},
],
functions: [
{
name: 'fields',
description: 'The fields you consider relevant to the conversation',
parameters: {
type: 'object',
additionalProperties: false,
properties: {
fields: {
type: 'array',
additionalProperties: false,
addditionalItems: false,
items: {
type: 'string',
additionalProperties: false,
addditionalItems: false,
},
},
},
required: ['fields'],
},
},
],
functionCall: 'fields',
},
},
}
)) as CreateChatCompletionResponse;

return chunkResponse.choices[0].message?.function_call?.arguments
? (
JSON.parse(chunkResponse.choices[0].message?.function_call?.arguments) as {
fields: string[];
}
).fields
.filter((field) => fieldNames.includes(field))
.map((field) => {
const fieldDescriptors = groupedFields[field];
return `${field}:${fieldDescriptors
.map((descriptor) => descriptor.type)
.join(',')}`;
})
: [chunkResponse.choices[0].message?.content ?? ''];
})
);

return {
content: {
indices: response.indices,
fields: relevantFields.flat(),
},
};
}
);
}
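The handler above narrows a potentially large field list by sending the names to the model in chunks of 500, then discarding any returned name that was not in the original list. A minimal sketch of those two steps, without lodash or the chat API, might look like the following. `FieldDescriptor`, `chunkList`, and `formatSelectedFields` are hypothetical names used only for illustration; they are not part of the PR.

```typescript
// Hypothetical descriptor shape, reduced to the two properties used here.
interface FieldDescriptor {
  name: string;
  type: string;
}

// Split a list into consecutive chunks of at most `size` items.
function chunkList<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Keep only the selected names that exist in `allFields`, and render each
// one as "name:type1,type2", mirroring the string format built in the handler.
function formatSelectedFields(selected: string[], allFields: FieldDescriptor[]): string[] {
  const typesByName = new Map<string, string[]>();
  for (const field of allFields) {
    const types = typesByName.get(field.name) ?? [];
    types.push(field.type);
    typesByName.set(field.name, types);
  }
  return selected
    .filter((name) => typesByName.has(name))
    .map((name) => `${name}:${(typesByName.get(name) ?? []).join(',')}`);
}
```

Filtering against the known names means a field the model made up never reaches the final response, whatever the model returns.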
@@ -14,6 +14,7 @@ import { registerElasticsearchFunction } from './elasticsearch';
import { registerKibanaFunction } from './kibana';
import { registerLensFunction } from './lens';
import { registerRecallFunction } from './recall';
import { registerGetDatasetInfoFunction } from './get_dataset_info';
import { registerSummarizationFunction } from './summarize';
import { registerAlertsFunction } from './alerts';
import { registerEsqlFunction } from './esql';
@@ -42,16 +43,16 @@ export async function registerFunctions({

let description = dedent(
`You are a helpful assistant for Elastic Observability. Your goal is to help the Elastic Observability users to quickly assess what is happening in their observed systems. You can help them visualise and analyze data, investigate their systems, perform root cause analysis or identify optimisation opportunities.

It's very important to not assume what the user is meaning. Ask them for clarification if needed.

If you are unsure about which function should be used and with what arguments, ask the user for clarification or confirmation.

In KQL, escaping happens with double quotes, not single quotes. Some characters that need escaping are: ':()\\\
/\". Always put a field value in double quotes. Best: service.name:\"opbeans-go\". Wrong: service.name:opbeans-go. This is very important!

You can use Github-flavored Markdown in your responses. If a function returns an array, consider using a Markdown table to format the response.

If multiple functions are suitable, use the most specific and easy one. E.g., when the user asks to visualise APM data, use the APM functions (if available) rather than Lens.

If a function call fails, do not execute it again with the same input. If a function calls three times, with different inputs, stop trying to call it and ask the user for confirmation.
@@ -67,8 +68,7 @@ export async function registerFunctions({
Additionally, you can use the "recall" function to retrieve relevant information from the knowledge database.`;

description += `Here are principles you MUST adhere to, in order:

- - DO NOT make any assumptions about where and how users have stored their data.
+ - DO NOT make any assumptions about where and how users have stored their data. ALWAYS first call get_dataset_info function with empty string to get information about available indices. Once you know about available indices you MUST use this function again to get a list of available fields for specific index. If user provides an index name make sure its a valid index first before using it to retrieve the field list by calling this function with an empty string!
`;
registerSummarizationFunction({ service, registerFunction });
registerRecallFunction({ service, registerFunction });
@@ -81,6 +81,7 @@ export async function registerFunctions({
registerEsqlFunction({ service, registerFunction });
registerKibanaFunction({ service, registerFunction, coreStart });
registerAlertsFunction({ service, registerFunction });
registerGetDatasetInfoFunction({ service, registerFunction });

registerContext({
name: 'core',
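
The system prompt in this hunk asks the model to call `get_dataset_info` twice: first with an empty string to list indices, then with a validated index name to list its fields. A rough sketch of that two-step flow, written as ordinary code rather than as model behavior, is shown below. `DatasetInfo`, `GetDatasetInfo`, and `listFieldsForIndex` are hypothetical names, and the exact-match check is a simplification, since the real function also accepts wildcard patterns.

```typescript
// Hypothetical response shape, loosely based on the function's returned content.
interface DatasetInfo {
  indices: string[];
  fields: string[];
}

// Hypothetical stand-in for invoking the get_dataset_info function.
type GetDatasetInfo = (index: string) => Promise<DatasetInfo>;

async function listFieldsForIndex(
  getDatasetInfo: GetDatasetInfo,
  wantedIndex: string
): Promise<string[]> {
  // Step 1: an empty string lists every available index.
  const { indices } = await getDatasetInfo('');

  // Validate the requested name before using it (exact match only in this sketch).
  if (indices.indexOf(wantedIndex) === -1) {
    throw new Error(`Unknown index: ${wantedIndex}`);
  }

  // Step 2: ask again with the validated name to get its fields.
  const { fields } = await getDatasetInfo(wantedIndex);
  return fields;
}
```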
@@ -38,23 +38,24 @@ function Lens({
end,
lens,
dataViews,
timeField,
}: {
indexPattern: string;
xyDataLayer: XYDataLayer;
start: string;
end: string;
lens: LensPublicStart;
dataViews: DataViewsServicePublic;
timeField: string;
}) {
const formulaAsync = useAsync(() => {
return lens.stateHelperApi();
}, [lens]);

const dataViewAsync = useAsync(() => {
return dataViews.create({
id: indexPattern,
title: indexPattern,
- timeFieldName: '@timestamp',
+ timeFieldName: timeField,
});
}, [indexPattern]);

@@ -199,6 +200,12 @@ export function registerLensFunction({
required: ['label', 'formula', 'format'],
},
},
timeField: {
type: 'string',
default: '@timefield',
description:
'time field to use for XY chart. Use @timefield if its available on the index.',
},
breakdown: {
type: 'object',
additionalProperties: false,
@@ -235,15 +242,15 @@ export function registerLensFunction({
description: 'The end of the time range, in Elasticsearch datemath',
},
},
- required: ['layers', 'indexPattern', 'start', 'end'],
+ required: ['layers', 'indexPattern', 'start', 'end', 'timeField'],
} as const,
},
async () => {
return {
content: {},
};
},
- ({ arguments: { layers, indexPattern, breakdown, seriesType, start, end } }) => {
+ ({ arguments: { layers, indexPattern, breakdown, seriesType, start, end, timeField } }) => {
const xyDataLayer = new XYDataLayer({
data: layers.map((layer) => ({
type: 'formula',
@@ -263,6 +270,8 @@ export function registerLensFunction({
},
});

if (!timeField) return;
Member

Should we not default to @timestamp in this case? Have you seen the LLM respond with an empty string?

Member Author

For some reason the lens function sometimes gets executed multiple times, and the first call (or any call but the last) is missing properties. This seems to be a bug somewhere else in our rendering, but I wasn't able to find it, so I added a check here to make sure we only render the chart once timeField is set.

timeField should actually never be empty, as it has a default value set.

Member

> For some reason the lens function sometimes gets executed multiple times, and the first call (or any call but the last) is missing properties. This seems to be a bug somewhere else in our rendering, but I wasn't able to find it, so I added a check here to make sure we only render the chart once timeField is set.

Not sure if I follow. Why is it a problem if it renders multiple times? (I assume it renders multiple times, but to the same element)

Member Author

Not on its own, but Lens then seems to make multiple requests, which is a performance hit if nothing else. I think it's safe not to render until we have all the required properties.


return (
<Lens
indexPattern={indexPattern}
@@ -271,6 +280,7 @@ export function registerLensFunction({
end={end}
lens={pluginsStart.lens}
dataViews={pluginsStart.dataViews}
timeField={timeField}
/>
);
}
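
The review thread in this hunk weighs two options for a missing `timeField`: skip rendering until it is set (what the PR does with `if (!timeField) return;`) or fall back to `@timestamp`, as the reviewer suggests. For comparison, a minimal sketch of the fallback alternative could look like this. `DEFAULT_TIME_FIELD` and `resolveTimeField` are hypothetical names, and this is not the approach that was merged.

```typescript
// Hypothetical constant: the fallback the reviewer proposes in the thread.
const DEFAULT_TIME_FIELD = '@timestamp';

// Return the supplied time field, or the default when it is missing or blank.
function resolveTimeField(timeField?: string): string {
  const trimmed = timeField?.trim();
  return trimmed ? trimmed : DEFAULT_TIME_FIELD;
}
```

A fallback like this would render on every invocation, including the incomplete ones the author describes, so it trades the extra Lens requests for never showing an empty chart.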
@@ -276,11 +276,84 @@ const setupKnowledgeBaseRoute = createObservabilityAIAssistantServerRoute({
},
});

const functionGetDatasetInfoRoute = createObservabilityAIAssistantServerRoute({
endpoint: 'POST /internal/observability_ai_assistant/functions/get_dataset_info',
params: t.type({
body: t.type({
index: t.string,
}),
}),
options: {
tags: ['access:ai_assistant'],
},
handler: async (
resources
): Promise<{
indices: string[];
fields: Array<{ name: string; description: string; type: string }>;
}> => {
const esClient = (await resources.context.core).elasticsearch.client.asCurrentUser;

const savedObjectsClient = (await resources.context.core).savedObjects.getClient();

const index = resources.params.body.index;

let indices: string[] = [];

try {
const body = await esClient.indices.resolveIndex({
name: index === '' ? '*' : index,
expand_wildcards: 'open',
});
indices = [...body.indices.map((i) => i.name), ...body.data_streams.map((d) => d.name)];
} catch (e) {
indices = [];
}

if (index === '') {
return {
indices,
fields: [],
};
}

if (indices.length === 0) {
return {
indices,
fields: [],
};
}

const dataViews = await (
await resources.plugins.dataViews.start()
).dataViewsServiceFactory(savedObjectsClient, esClient);

const fields = await dataViews.getFieldsForWildcard({
pattern: index,
});

// else get all the fields for the found dataview
return {
indices: [index],
fields: fields.flatMap((field) => {
return (field.esTypes ?? [field.type]).map((type) => {
return {
name: field.name,
description: field.customLabel || '',
type,
};
});
}),
};
},
});

export const functionRoutes = {
...functionElasticsearchRoute,
...functionRecallRoute,
...functionSummariseRoute,
...setupKnowledgeBaseRoute,
...getKnowledgeBaseStatus,
...functionAlertsRoute,
...functionGetDatasetInfoRoute,
};
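
The route above returns one entry per field and Elasticsearch type, falling back to the generic `type` when `esTypes` is absent, and uses the custom label (if any) as the description. A standalone sketch of that flattening step is given below. `FieldSpecLike`, `FieldRow`, and `toFieldRows` are hypothetical names that model only the properties the handler reads; they are not the data views plugin's actual types.

```typescript
// Hypothetical input shape: only the properties the route handler reads.
interface FieldSpecLike {
  name: string;
  type: string;
  esTypes?: string[];
  customLabel?: string;
}

// Output shape matching the handler's declared return type for `fields`.
interface FieldRow {
  name: string;
  description: string;
  type: string;
}

// Emit one row per (field, type) pair; use `type` when `esTypes` is missing.
function toFieldRows(fields: FieldSpecLike[]): FieldRow[] {
  const rows: FieldRow[] = [];
  for (const field of fields) {
    const types = field.esTypes ?? [field.type];
    for (const type of types) {
      rows.push({ name: field.name, description: field.customLabel || '', type });
    }
  }
  return rows;
}
```

A field mapped differently across the matched indices (for example `text` in one and `keyword` in another) therefore appears as two rows with the same name, which is why the client-side code groups descriptors by name before formatting them.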
2 changes: 2 additions & 0 deletions x-pack/plugins/observability_ai_assistant/server/types.ts
@@ -17,6 +17,7 @@ import type {
TaskManagerSetupContract,
TaskManagerStartContract,
} from '@kbn/task-manager-plugin/server';
import { DataViewsServerPluginStart } from '@kbn/data-views-plugin/server';

/* eslint-disable @typescript-eslint/no-empty-interface*/
export interface ObservabilityAIAssistantPluginStart {}
@@ -32,4 +33,5 @@ export interface ObservabilityAIAssistantPluginStartDependencies {
security: SecurityPluginStart;
features: FeaturesPluginStart;
taskManager: TaskManagerStartContract;
dataViews: DataViewsServerPluginStart;
}