feat: filter invalid queries for natural language search #623

Merged
merged 5 commits into from
Sep 25, 2024
1 change: 0 additions & 1 deletion pkg/core/handler/search/search.go
@@ -71,7 +71,6 @@ func SearchForResource(searchMgr *search.SearchManager, aiMgr *ai.AIManager, sea
handler.FailureRender(ctx, w, r, err)
return
}

res, err := aiMgr.ConvertTextToSQL(searchQuery)
if err != nil {
handler.FailureRender(ctx, w, r, err)
3 changes: 3 additions & 0 deletions pkg/core/manager/ai/search.go
@@ -28,6 +28,9 @@ func (a *AIManager) ConvertTextToSQL(query string) (string, error) {
if err != nil {
return "", err
}
if IsInvalidQuery(res) {
return "", ErrInvalidQuery
}
return ExtractSelectSQL(res), nil
}

1 change: 1 addition & 0 deletions pkg/core/manager/ai/types.go
@@ -18,4 +18,5 @@ import "errors"

var (
ErrMissingAuthToken = errors.New("auth token is required")
ErrInvalidQuery = errors.New("query is invalid")
)
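Because ErrInvalidQuery is a sentinel error, a caller can tell "the question was not precise enough" apart from other failures with errors.Is. The following is a minimal sketch, not code from this PR: convertTextToSQL and statusFor are hypothetical stand-ins, and the mapping to HTTP status codes is an assumption about how a handler might want to react.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrInvalidQuery mirrors the sentinel error added in types.go.
var ErrInvalidQuery = errors.New("query is invalid")

// convertTextToSQL is a hypothetical stand-in for AIManager.ConvertTextToSQL.
// It takes the raw model output and wraps the sentinel error, so callers
// must use errors.Is instead of a direct == comparison.
func convertTextToSQL(modelOutput string) (string, error) {
	if modelOutput == "Error" {
		return "", fmt.Errorf("convert text to SQL: %w", ErrInvalidQuery)
	}
	return modelOutput, nil
}

// statusFor is a hypothetical helper that maps an error to an HTTP status.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrInvalidQuery):
		// The user's question was too vague; this is a client-side problem.
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	_, err := convertTextToSQL("Error")
	fmt.Println(statusFor(err)) // prints 400

	sql, err := convertTextToSQL("select * from resources where kind='Pod';")
	fmt.Println(sql, statusFor(err))
}
```

Keeping the sentinel exported lets handlers outside the ai package make this distinction without parsing error strings.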
10 changes: 9 additions & 1 deletion pkg/core/manager/ai/util.go
@@ -14,7 +14,15 @@

package ai

import "regexp"
import (
"regexp"
"strings"
)

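// IsInvalidQuery reports whether the AI response contains the word "error"
// (case-insensitive), which the prompt uses to mark a query it cannot convert.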
func IsInvalidQuery(sql string) bool {
return strings.Contains(strings.ToLower(sql), "error")
}

// ExtractSelectSQL extracts SQL statements that start with "SELECT * FROM"
func ExtractSelectSQL(sql string) string {
27 changes: 27 additions & 0 deletions pkg/core/manager/ai/util_test.go
@@ -42,3 +42,30 @@ func TestExtractSelectSQL(t *testing.T) {
})
}
}

// TestIsInvalidQuery tests the IsInvalidQuery function.
func TestIsInvalidQuery(t *testing.T) {
testCases := []struct {
name string
sql string
expected bool
}{
{
name: "ValidQueryWithoutError",
sql: "select * from resources where kind='namespace';",
expected: false,
},
{
name: "InvalidQuery",
sql: "Error",
expected: true,
},
}

for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
actual := IsInvalidQuery(tc.sql)
require.Equal(t, tc.expected, actual)
})
}
}
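One case the tests above do not cover is a valid statement whose literal happens to contain the word "error", for example a resource named error-handler. The substring check in IsInvalidQuery would reject it. The sketch below contrasts that check with a stricter variant; isErrorSentinel is a hypothetical helper, not part of this PR, and it assumes the model returns the sentinel "Error" (optionally followed by ";") as its entire response.

```go
package main

import (
	"fmt"
	"strings"
)

// isInvalidQuery reproduces the substring check from this PR.
func isInvalidQuery(sql string) bool {
	return strings.Contains(strings.ToLower(sql), "error")
}

// isErrorSentinel is a hypothetical stricter variant: it returns true only
// when the whole response, ignoring surrounding whitespace and one trailing
// semicolon, is the word "error" in any letter case.
func isErrorSentinel(resp string) bool {
	s := strings.TrimSpace(resp)
	s = strings.TrimSuffix(s, ";")
	s = strings.TrimSpace(s)
	return strings.EqualFold(s, "error")
}

func main() {
	q := "select * from resources where name='error-handler';"
	fmt.Println(isInvalidQuery(q), isErrorSentinel(q)) // prints: true false

	fmt.Println(isInvalidQuery("Error;"), isErrorSentinel("Error;")) // prints: true true
}
```

The trade-off is that the stricter variant misses responses where the model adds extra text around the sentinel, so the substring check is the more forgiving choice if the model does not follow the output format exactly.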
26 changes: 21 additions & 5 deletions pkg/infra/ai/prompts.go
@@ -20,6 +20,7 @@ const (
text2sql_prompt = `
You are an AI specialized in writing SQL queries.
Please convert the text %s to sql.
If the text is not accurate enough, please output "Error".
The output tokens only need to give the SQL first, the other thought process please do not give.
The SQL should begin with "select * from" and end with ";".

@@ -30,23 +31,33 @@ const (
resourceVersion, labels.[key], annotations.[key], content]

2. find the schema_links for generating SQL queries for each question based on the database schema.
If there are Chinese expressions, please translate them into English.

The following are some examples.

Q: find the kind which is not equal to pod
A: Let’s think step by step. In the question "find the kind column which is not equal to pod", we are asked:
"find the kind" so we need column = [kind]
"find the kind" so we need column = [kind].
Based on the columns, the set of possible cell values are = [pod].
So the Schema_links are:
Schema_links: [kind, pod]

Q: find the kind Deployment which created before January 1, 2024, at 18:00:00
A: Let’s think step by step. In the question "find the kind Deployment which created before January 1, 2024, at 18:00:00", we are asked:
"find the kind Deployment" so we need column = [kind]
"created before" so we need column = [creationTimestamp]
"find the kind Deployment" so we need column = [kind].
"created before" so we need column = [creationTimestamp].
Based on the columns, the set of possible cell values are = [Deployment, 2024-01-01T18:00:00Z].
So the Schema_links are:
Schema_links: [kind, creationTimestamp, Deployment, 2024-01-01T18:00:00Z]
Schema_links: [[kind, Deployment], [creationTimestamp, 2024-01-01T18:00:00Z]]

Q: find the kind Namespace which which created
A: Let’s think step by step. In the question "find the kind", we are asked:
"find the kind Namespace " so we need column = [kind]
"created before" so we need column = [creationTimestamp]
Based on the columns, the set of possible cell values are = [kind, creationTimestamp].
There is no cell value corresponding to creationTimestamp, so the text is not accurate enough.
So the Schema_links are:
Schema_links: error

3. Use the schema links to generate the SQL queries for each of the questions.

@@ -57,14 +68,19 @@ const (
SQL: select * from resources where kind!='Pod';

Q: find the kind Deployment which created before January 1, 2024, at 18:00:00
Schema_links: [kind, creationTimestamp, Deployment, 2024-01-01T18:00:00Z]
Schema_links: [[kind, Deployment], [creationTimestamp, 2024-01-01T18:00:00Z]]
SQL: select * from resources where kind='Deployment' and creationTimestamp < '2024-01-01T18:00:00Z';

Q: find the namespace which does not contain banan
Schema_links: [namespace, banan]
SQL: select * from resources where namespace notlike 'banan_';

Q: find the kind Namespace which which created
Schema_links: error
Error;

Please convert the text to sql.
If the text is not accurate enough, please output "Error".
`

sql_fix_prompt = `